Abstract: In this paper we discuss quantum computational restrictions on the types
of thought experiments recently used by Almheiri, Marolf, Polchinski, and Sully to
argue against the smoothness of black hole horizons. We argue that the quantum
computations required to do these experiments would take a time which is exponential
in the entropy of the black hole under study, and we show that for a wide variety of
black holes this prevents the experiments from being done. We interpret our results as
motivating a broader type of non-locality than is usually considered in the context of
black hole thought experiments, and claim that once this type of non-locality is allowed
there is no need for firewalls. Our results do not threaten the unitarity of black hole
evaporation or the ability of advanced civilizations to test it.
1 Introduction



The recently proposed firewall phenomenon [1] has dramatically emphasized the extent 
to which black holes remain interesting and mysterious in quantum gravity. Using 
"reasonable" assumptions about the structure of the quantum theory of black holes, 
the authors of [1], henceforth referred to as AMPS, have argued that unitarity of black 
hole evaporation, along with some limited form of locality, is inconsistent with a smooth 
horizon for an observer falling into an "old" black hole. There has been significant 
discussion of this claim in the literature, but in our view none of the followup work so 
far has decisively challenged the original argument. 

We will review the AMPS argument in section 2 below, but a key point in moti- 
vating their setup is a claim that an infalling observer is able to extract information 
from the Hawking radiation of a black hole prior to falling in. 1 The model black hole 
of Hayden and Preskill [2], also partially reviewed below, suggests that this should be 
possible provided that the observer is able to perform a sophisticated non-local mea- 
surement on the Hawking radiation that has come out so far. In this paper we will 
argue using methods from the theory of quantum computation that this measurement 
can almost certainly not be done fast enough and thus that the AMPS experiment is 
not operationally realizable even in principle. 

As a simple example, consider a Schwarzschild black hole in 3 + 1 dimensions. Its
entropy is proportional to $M^2$ in Planck units, and it evaporates in a time proportional
to $M^3$. A would-be AMPS experimentalist thus has to extract information from $n \sim M^2$
bits of Hawking radiation in a time $T \sim n^{3/2}$ to be able to jump in before the black hole
evaporates. From a computer science point of view this is very special: the decoding 
needs to be accomplished in a time that scales as a low-order polynomial in n, which 
is rarely possible even for highly structured codes. In order to get at the information 
our experimentalist would need to apply a unitary transformation to the Hawking 
radiation which "unscrambles" the desired information by putting it into an easily 
accessible subfactor of the Hilbert space. 2 As we will review below in section 3.2, 
applying a generic unitary transformation to an n-bit Hilbert space requires time that 
is exponential in n. Only very special unitary transformations can be implemented 
faster, and in this paper we will argue that the decoding operation relevant to AMPS 
is unlikely to be special in this way. In fact we conjecture, although we cannot rigorously

1 As emphasized by AMPS, the observer does not actually need to do the experiment to get into
trouble. The possibility of the experiment being done is enough to argue against a smooth horizon.
We will be more precise below about what is meant by "extract information".

2 In line with standard parlance we will sometimes refer to this operation as "decoding"; we will 
see the precise connection to what is called decoding in quantum information theory in section 4. 
prove, that the decoding time will in general be exponential in the entropy. 3
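
To get a feeling for the gap between the two scalings, the following minimal sketch compares the time available before evaporation, $T \sim n^{3/2}$, with a decoding time that is assumed to scale as $e^n$. All order-one factors are dropped, Planck units are used, and the function names are introduced here only for illustration; the value $n \sim 10^{77}$ is the rough entropy of a solar-mass black hole.

```python
import math


def log10_evaporation_time(n: float) -> float:
    """log10 of the time available before evaporation, assuming T ~ n**1.5."""
    return 1.5 * math.log10(n)


def log10_exponential_decoding_time(n: float) -> float:
    """log10 of a decoding time assumed (conjecturally) to scale as e**n."""
    return n / math.log(10.0)


def decoding_outlasts_black_hole(n: float) -> bool:
    """True if the assumed exponential decoding time exceeds the evaporation time."""
    return log10_exponential_decoding_time(n) > log10_evaporation_time(n)


if __name__ == "__main__":
    # 1e77 is roughly the entropy of a solar-mass black hole (illustrative value).
    for n in (10.0, 100.0, 1e77):
        print(
            f"n = {n:.0e}: log10(T_evap) = {log10_evaporation_time(n):.1f}, "
            f"log10(T_decode) = {log10_exponential_decoding_time(n):.3g}"
        )
```

Working with logarithms avoids overflow: for macroscopic $n$ the exponential time is not representable as a floating point number, while its logarithm is.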

In light of our discussion, a firewall enthusiast might nonetheless argue that even 
though the decoding cannot be done the information is still "there". We are sympa- 
thetic to this point of view, but consider it to be outdated. It has been clear for a long 
time now that operational constraints are important in understanding the structure of 
the Hilbert space used in describing black hole evaporation, and we view our results 
in this context. The real issue at stake is for which types of questions we should trust 
effective field theory (EFT). Traditionally EFT was viewed as holding away from local 
regions with high energy density or spacetime curvature. If this is the only way in 
which EFT can break down however, then we seem to be led inexorably to information 
loss [3]. If we wish to maintain belief in unitarity, as AdS/CFT strongly suggests we 
should, then there must be a more broad set of criteria for when EFT may not be 
valid. It is not trivial however to find such criteria which do not flagrantly violate 
the extraordinary level to which EFT has been experimentally tested. Careful analysis 
of thought experiments near black holes in the mid 1990's [4-6] led to an additional 
criterion involving causality: 

• Two spacelike-separated low-energy observables which cannot both be causally 
accessed by some single observer do not need to be realized even approximately 
as distinct and commuting operators on the same Hilbert space. 

This criterion was claimed to preclude the apparent contradiction between unitarity 
and local EFT in Hawking's argument. It is clear that it does not lead to obvious 
testable violations of EFT, and it was also claimed to avoid more subtle problems like 
quantum cloning and unacceptably large baryon number violation. The key point for 
us however is that this criterion, which is a profound statement about the structure of 
the Hilbert space of quantum gravity, was motivated by operational constraints. It says 
that whether or not quantum information is "there" is indeed related to its practical 
accessibility. 

The deep insight of AMPS is that even with this stronger causal restriction on when 
we may use effective field theory there is still a paradox that seems to require further 
modification of the rules, either by having firewalls or by further violating effective 
field theory. We interpret our results as supporting a new criterion for the validity of 
effective field theory: 

3 It may seem that beating $n^{3/2}$ with $e^n$ is overkill, but we will show that for more general types of
black holes the exponential really is necessary to prevent the AMPS experiment from being done. In 
those cases it will be recurrence phenomena rather than evaporation which doom the experiment. 
• Two spacelike-separated low-energy observables which are not both computationally
accessible to some single observer do not need to be realized even approximately
as distinct and commuting operators on the same Hilbert space.

By computationally inaccessible we mean that one or both of them is so quantum- 
mechanically non-local that measuring it would require more time and/or memory than 
the observer fundamentally has available. This criterion clearly implies the previous 
one, but it is stronger: as we will see, it can apply even if both operators are within 
the past lightcone of some observer. In Minkowski space this criterion (and also the 
causality criterion) is irrelevant, but in spacetimes with singularities there are observers
whose available time is fundamentally limited. This is also true in spacetimes where the
fundamental Hilbert space is effectively finite, for example the de Sitter static patch. 
In that case recurrence phenomena limit how much time is available for low-energy 
observation rather than a singularity. 4 By violating what AMPS call postulate II, 
we claim that this criterion removes the need for firewalls. Lest the reader worry we 
will throw the baby out with the bathwater, we observe that a traditional asymptotic 
observer at infinity, whom we will refer to as Charlie, has all the time and memory 
needed to measure the Hawking radiation as carefully as he likes. Our arguments 
are thus no threat to the unitarity of black hole evaporation as a precise quantum 
mechanical process; there will be new restrictions only for an observer we call Alice 
who falls into the black hole before it evaporates. 5 

Both the weaker "causality" criterion and the stronger "computational" criterion 
are negative statements; they tell us what the Hilbert space is not. It is of course 
very important to understand what the structure of the Hilbert space is, and there 
are two interesting proposals. The first, sometimes called "strong complementarity", 
takes the point of view that each observer has her own quantum mechanical theory, 
which is precise for some special observers and approximate in general. 6 There are then 

4 In [7] these limits were used to conjecture that "precise" descriptions of spacetime require observers 
who have access to an infinite amount of information. This conjecture is distinct from the idea proposed 
here, but they are clearly related. 

5 There could be "cosmological" restrictions on what Charlie is able to do, but it seems that these 
should be decoupled from restrictions "intrinsic" to the black hole. From our discussion of $U_{dyn}$ in
section 3.3 it seems that in a completely pristine environment Charlie should even be able to test 
unitarity in polynomial time; we won't address whether or not Charlie would be able to do this in a 
"noisy" environment. 

6 This general point of view has been advocated for a while by Banks and Fischler, who try to 
realize it more concretely in a formalism called "holographic spacetime" [8, 9]. In their setup quantum 
mechanics is precise for all observers, even those who encounter singularities or recurrences. They 
have recently argued that their formalism evades the firewall argument [10], but their claim requires a 
decoupling in Charlie's description of the black hole dynamics between the near-horizon field theory

consistency conditions between the different theories to ensure that observers who can
communicate with each other agree on the results of low-energy experiments visible to 
them both. Within this framework it was argued [12, 13] that if the AMPS experiment 
cannot be done, the firewall argument breaks down. 7 The basic point is that physical 
restrictions on what measurements can be done weaken the overlap conditions, allowing 
for more "disagreement" between Alice and Charlie's quantum mechanical descriptions 
of what is going on. 

The other proposal for the Hilbert space structure, which might be called "standard 
complementarity", claims that there is a single Hilbert space in which states undergo 
exact unitary evolution. The quantum descriptions of various observers are embedded 
into this single Hilbert space as sets of operators that approximately commute within 
a given set but do not necessarily commute between different sets. This viewpoint is 
essentially that of [4-6], and in the context of firewalls it is sometimes called "$A = R_B$"
for reasons we will soon see. It has been suggested by several people 8 as a way out of 
firewalls, but it has so far run into various paradoxes involving apparent cloning and 
acausality [12, 16]. We postpone further discussion of these two options until after we 
present the firewall argument, but it seems that in either framework our computational 
argument may be sufficient to avoid the paradoxes without any need for firewalls. 

It is also interesting to think about whether more general types of black holes have 
firewalls. For example Reissner-Nordstrom black holes semiclassically seem to take 
an infinite amount of time to evaporate, apparently allowing ample time for quantum 
computation prior to jumping in. We will explain however that the well-known "frag- 
mentation" phenomenon [17, 18] destroys the black hole well before the computation 
can be completed. Big AdS black holes do not evaporate at all, so the AMPS argument 
does not directly apply to them, but arguments have been put forward suggesting that 
they nonetheless have firewalls. In particular Don Marolf has argued that one could 
simply mine the black hole until half of its entropy is gone, after which the mining 
equipment would play the role of the Hawking radiation in the original AMPS argu- 
ment. If the decoding time is indeed exponential in the entropy of the black hole, 
as we argue it is, then it becomes comparable to the Poincare recurrence time of the 
AdS-Schwarzschild space [19]; we argue that no observer or computer can isolate itself 
from a big black hole for so long in AdS space. 

modes and the horizon degrees of freedom which we find rather implausible, especially in the context
of the mining operations of [1, 11].

7 In [13] one of us tried to argue, for a reason having nothing to do with computation, that the 
experiment cannot be done. That argument proved unconvincing, and we regard the computational 
complexity arguments of this paper to be much stronger. 

8 Including but probably not limited to [12, 14, 15].
The strictest test of our criticism of the AMPS experiment uses a setup suggested to
us by Juan Maldacena, in which a large AdS black hole is placed very far down a throat 
whose geometry is asymptotically Minkowski. By putting the decoding apparatus out 
in the Minkowski region, it seems that one could use the redshift to arbitrarily speed 
up the decoding time compared to the evaporation time. We will explain in section 5 
however that the decoupling which makes the decay slow in this situation also makes 
it very difficult to send the results of the computation back down the throat, and for 
a particular example we show that for a wide variety of probes the time required to 
successfully send a message down the throat is longer than the recurrence time of the 
black hole down the throat. So indeed it seems there is a fairly robust conspiracy 
preventing the AMPS experiment from being done. These results are consistent with 
a point of view expressed by Aaronson [20] that the laws of physics should not permit 
computational machines that radically alter the basic structure of complexity theory. 
At most, they should force some marginal changes around the edges, as in the case of 
Shor's factoring algorithm. 

It is interesting to note that if our computational argument is correct, it supersedes 
many of the classic black hole thought experiments [2, 4-6]. In particular the argument 
of [2] that the scrambling of information by a black hole in a time no faster than 
M log M is necessary to prevent observable cloning would no longer be needed.

This introduction has telegraphically sketched our main points. In the remainder 
of the paper we will make the case again in much more depth. Because of our expected 
audience we will try to keep our discussions of quantum computation and coding self- 
contained, but the same definitely cannot be said for our discussions of black holes and 
gravity. Other work on firewalls includes [21-29]. 

2 The Firewall Argument 

We begin with a somewhat reorganized presentation of the original argument of [1]. 
Their argument rests on some basic assumptions about the quantum description of 
a black hole from the points of view of an external observer Charlie and an infalling 
observer Alice, and we will try to be clear about what these assumptions are. The 
argument has many fine technical points, and we will not address all of them. Our goal 
is to motivate equation (2.4), on which the rest of our paper will be based. 

2.1 A Quantum Black Hole from the Outside 

We'll first discuss Charlie's description, which is based on the following three postulates: 

• According to Charlie, the formation and evaporation of the black hole is a unitary 
process. Moreover, in addition to an asymptotic S-matrix, we can also think about 
either continuous or discrete time evolution in which at any given time there is a
pure quantum state $|\Psi\rangle$ in some Hilbert space $\mathcal{H}_{outside}$.

• At any given time in this unitary evolution we can factorize $\mathcal{H}_{outside}$ into subfactors
with simple semiclassical interpretations:

$$\mathcal{H}_{outside} = \mathcal{H}_H \otimes \mathcal{H}_B \otimes \mathcal{H}_R. \qquad (2.1)$$

Here $\mathcal{H}_R$ are the modes of the radiation field outside of the black hole, roughly
with Schwarzschild coordinate radius $r > 3GM$. $\mathcal{H}_B$ are the field theory modes
in the near-horizon region, roughly with support over $2GM + \epsilon < r < 3GM$
where $\epsilon$ is some UV cutoff. The geometry in this region is close to Rindler
space. $\mathcal{H}_H$ are the remaining degrees of freedom in the black hole, which we can
heuristically think of as being at the stretched horizon at $r = 2GM + \epsilon$. Clearly
the distinctions between these subfactors are somewhat arbitrary; in particular
it will be convenient to restrict the modes in $\mathcal{H}_B$ to have Schwarzschild energy
less than the black hole temperature $T = \frac{1}{8\pi G M}$. Those with higher energy are
not really confined to the near-horizon region and we will include them as part
of $\mathcal{H}_R$. The time evolution of $|\Psi\rangle$ does not respect this factorization and cannot
be computed using low energy field theory, but for our purposes it is enough to
consider the state at a given time.

• If $|H|$ and $|B|$ are the dimensionalities of $\mathcal{H}_H$ and $\mathcal{H}_B$ respectively, then $\log |H|$
and $\log |B|$ are both proportional to the area of the black hole horizon in Planck
units at the time at which we study $|\Psi\rangle$. Thus their size decreases with time.
Naively $\mathcal{H}_R$ is infinite dimensional since all sorts of things could be going on
far from the black hole, but we will restrict its definition to only run over the
subfactor which in $|\Psi\rangle$ is nontrivially involved in the black hole dynamics. Thus
the size of $\mathcal{H}_R$ grows with time.

Assumption three leads to an interesting distinction between "young" and "old" black
holes [2], with the separation based on whether $|R|$ is bigger or smaller than $|H||B|$.
When the black hole is young, $|R|$ is quite small and B and H are entangled significantly.
As the black hole becomes old however, $|R|$ becomes large and B and H taken together
become a small subsystem of the full Hilbert space $\mathcal{H}_{outside}$. Page's theorem [30] then
suggests that the combined system BH has a density operator which is close to being
proportional to the identity operator:

$$\rho_{BH} \approx \frac{1}{|B||H|} I_B \otimes I_H. \qquad (2.2)$$

The time beyond which this is true has come to be called the Page time. More carefully
we would expect a thermal distribution in the Schwarzschild energy at the usual
temperature $T = \frac{1}{8\pi G M}$, but since we have put high-frequency modes in $\mathcal{H}_R$ the thermal
density matrix for $\mathcal{H}_H \otimes \mathcal{H}_B$ is quite close to (2.2). 10
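
The statement that a small subsystem of a typical pure state is nearly maximally mixed can be checked numerically for small dimensions. The sketch below is only an illustration under stated assumptions: `dim_small` stands in for $|B||H|$ and `dim_large` for $|R|$ after the Page time, the state is drawn Haar-randomly rather than produced by any black hole dynamics, and the comparison value is the leading Page estimate $\log|A| - |A|/(2|B|)$ for a factorization with $|A| \le |B|$. The names are hypothetical.

```python
import numpy as np


def reduced_state_of_small_factor(dim_small, dim_large, rng):
    """Reduced density matrix on the small factor of a Haar-random pure state."""
    # A normalized complex Gaussian array is a Haar-random state on small x large.
    amplitudes = rng.normal(size=(dim_small, dim_large)) \
        + 1j * rng.normal(size=(dim_small, dim_large))
    amplitudes /= np.linalg.norm(amplitudes)
    # Partial trace over the large factor.
    return amplitudes @ amplitudes.conj().T


def von_neumann_entropy(rho):
    """Entropy -Tr(rho log rho) in nats, ignoring numerically zero eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(rho)
    eigenvalues = eigenvalues[eigenvalues > 1e-15]
    return float(-np.sum(eigenvalues * np.log(eigenvalues)))


rng = np.random.default_rng(0)
dim_small, dim_large = 4, 256  # illustrative sizes only
rho_small = reduced_state_of_small_factor(dim_small, dim_large, rng)

entropy = von_neumann_entropy(rho_small)
page_estimate = np.log(dim_small) - dim_small / (2 * dim_large)
# How far each eigenvalue sits from the maximally mixed value 1/dim_small.
max_eigenvalue_deviation = np.max(
    np.abs(np.linalg.eigvalsh(rho_small) - 1.0 / dim_small)
)

print("entropy:", entropy)
print("Page estimate:", page_estimate)
print("max eigenvalue deviation from 1/dim:", max_eigenvalue_deviation)
```

Increasing `dim_large` at fixed `dim_small` pushes the reduced state closer to the identity, which is the regime relevant for an old black hole.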

We can describe the state concisely by saying that BH is maximally entangled
with a subspace in R. More precisely, there is a $|\Psi\rangle$-dependent decomposition of $\mathcal{H}_R$

$$\mathcal{H}_R = \left(\mathcal{H}_{R_H} \otimes \mathcal{H}_{R_B}\right) \oplus \mathcal{H}_{other}, \qquad (2.3)$$

with $|R_H| = |H|$ and $|R_B| = |B|$, such that we can write the state of the full system,
to a good approximation, as 11

$$|\Psi\rangle = \left( \frac{1}{\sqrt{|H|}} \sum_h |h\rangle_H |h\rangle_{R_H} \right) \otimes \left( \frac{1}{\sqrt{|B|}} \sum_b |b\rangle_B |b\rangle_{R_B} \right). \qquad (2.4)$$

Here h and b label orthonormal bases for $\mathcal{H}_H$ and $\mathcal{H}_B$ respectively, and we have
chosen convenient complementary bases for $\mathcal{H}_{R_H}$ and $\mathcal{H}_{R_B}$. $R_H$ and $R_B$ are called the
purifications of H and B respectively. The state has zero projection onto $\mathcal{H}_{other}$.

This form of $|\Psi\rangle$ makes it clear that any measurement done on B is perfectly
correlated with some other measurement done on $R_B$. This consequence of the entanglement
was emphasized in [1]; in their language measurements done on $R_B$ project onto
particular states in the basis $|b\rangle_B$. We hope that it is clear however that the presence of
this entanglement does not require any such measurement to be done; once we accept
the three assumptions the entanglement follows directly. Indeed we would argue that
Charlie's ability to measure $R_B$ provides justification for accepting the Hilbert space
structure of the model. 12



9 Page's theorem says that in a Hilbert space which can be factorized into $\mathcal{H}_A \otimes \mathcal{H}_B$ with $|A| \le |B|$,
a typical pure state has $S_A = \log |A| - \frac{|A|}{2|B|} + \ldots$. Here $\ldots$ are terms that vanish faster in the limit
$|A| \le |B|$ with both $|A|$ and $|B|$ large.

10 It is important in what follows that these "low-energy" modes can have quite high proper energy 
near the horizon, so they are relevant to the experience of an observer who is near the horizon even 
though the Schwarzschild temperature is typically very small compared to any scale relevant to that 
observer. 

11 This representation of the state is called the Schmidt decomposition.

12 Remember Charlie stays outside the black hole and has an arbitrarily large amount of time and 
resources, so there seems to be no limit on his experimental ability. This will be different for Alice, 
whom we discuss now. 
2.2 A Quantum Black Hole from the Inside

We now consider the point of view of Alice, the infalling observer. As mentioned in the
introduction, it may be possible to "embed" Alice's quantum mechanics into Charlie's 
via some sort of nontrivial operator mapping. We will discuss this eventually but for 
the moment will just treat Alice's theory as an independent construction. Here are the 
basic assumptions about it: 

• Although Alice eventually hits the singularity, we imagine that well before that 
she has an approximately quantum mechanical description of her experiences in 
terms of a quantum state on a time slice like the one shown in figure 1. We will 
not insist that the state be pure. 

• Alice's Hilbert space also has a roughly semiclassical factorization of the form

$$\mathcal{H}_{inside} = \mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_R \otimes \mathcal{H}_{H'}. \qquad (2.5)$$

Here $\mathcal{H}_B$ and $\mathcal{H}_R$ are factors shared with Charlie, since they are outside the black
hole horizon and are causally accessible to both Charlie and Alice. $\mathcal{H}_A$ are the
field theory modes just inside the horizon, say with support over $GM < r <
2GM - \epsilon$. $\mathcal{H}_{H'}$ are the remaining degrees of freedom having to do with Alice's
horizon (which is distinct from the black hole horizon). $\mathcal{H}_H$ is absent; the region
$2GM - \epsilon < r < 2GM + \epsilon$ is passed through by Alice in an extremely short
period of time and does not have any operational meaning to her. Of course, at
times long before she falls in, the black hole horizon is indistinguishable from her
horizon, and $\mathcal{H}_H$ is roughly part of $\mathcal{H}_{H'}$. We emphasize however that the details
of this accounting don't matter.

• Because $\mathcal{H}_B$ and $\mathcal{H}_R$ are shared with Charlie, we must have $\rho_{BR}^{Alice} = \rho_{BR}^{Charlie}$.
This is sometimes called the overlap rule, and it is designed to prevent contradictions
where Alice and Charlie disagree about the results of experiments they can
communicate about.

With these assumptions, one can now argue following AMPS that Alice must not
see a smooth vacuum at the horizon. Recall that the modes in $\mathcal{H}_B$ and $\mathcal{H}_A$ are basically
Rindler modes on two different sides of a Rindler horizon. In the Minkowski vacuum
state such modes are close to maximally entangled:

$$|\Psi\rangle_{AB} \propto \sum_\omega e^{-\beta\omega/2} |\omega\rangle_A |\omega\rangle_B, \qquad (2.6)$$

where $\beta\omega$ is the dimensionless Rindler energy; here it is just the ratio of Schwarzschild
energy to the black hole temperature. This however is problematic from our discussion

Figure 1. Alice's quantum mechanics, compared to Charlie's. The world inside her horizon
is drawn in blue and the time slice she quantizes on is in red. For reference Charlie's world 
is in yellow, and the overlap is green. We've chosen Charlie's black slice to coincide closely 
with Alice's near B and R. 

of Charlie. We argued that for an old black hole Charlie should see B being close
to maximally entangled with $R_B$, and by the third assumption about Alice this must
also be true for her. But entanglement in quantum mechanics is monogamous: such
entanglement prevents B from also being entangled with A as in (2.6). One way to
see this more precisely, again following AMPS, who themselves were motivated by a 
similar argument due to Mathur [31], is to note that strong subadditivity requires 

$$S_{ABR_B} + S_B \le S_{AB} + S_{BR_B}. \qquad (2.7)$$

Since $S_{BR_B} = 0$ this implies that $S_{AB} \ge S_B$, which is inconsistent with $\rho_{AB}$ being close
to the form (2.6). This concludes the AMPS argument; one possible interpretation 
is that the resolution of the contradiction is that there is a "firewall" of high energy 
quanta at the horizon of an old black hole which annihilates any infalling observer. 
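
The entropy step in this argument can be spelled out a little further. The following is a sketch that idealizes B and its purification as exactly maximally entangled, so that their joint state is pure; the symbol $|\phi\rangle_{BR_B}$ for that pure state is introduced here only for illustration.

```latex
\begin{align*}
S_{BR_B} = 0
  &\;\Longrightarrow\; \rho_{ABR_B} = \rho_A \otimes |\phi\rangle\langle\phi|_{BR_B}
  \;\Longrightarrow\; S_{ABR_B} = S_A, \\
S_{ABR_B} + S_B \le S_{AB} + S_{BR_B}
  &\;\Longrightarrow\; S_{AB} \ge S_A + S_B \ge S_B .
\end{align*}
```

A pure entangled state of A and B would instead have $S_{AB} = 0$ with $S_B > 0$, which is the contradiction. With only approximate purity of $BR_B$ the same chain holds up to small corrections.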

2.3 A Way Out? 

A key step in the AMPS argument is that B and $R_B$ are accessible to both Charlie and
Alice and that therefore they must agree on the entanglement between them. But is
this really true? In [13] it was argued that it is difficult for Alice to measure B because
she passes through it quickly, but the details of that argument have not worked out
satisfactorily. The much more difficult measurement however is the one on $R_B$. $R_B$ is
defined as the subfactor of the Hawking radiation which is entangled with B, but this
subfactor is presumably very convoluted from the point of view of a basis of Hawking
quanta that is easy to measure. Probing $R_B$ entails doing quantum measurements
involving nonlocal quantum superpositions of large numbers of Hawking quanta. Such 
measurements need to be very carefully engineered, and one might expect that this 
engineering takes a significant amount of time. This is no problem for Charlie, who 
has all the time in the world to look at the Hawking radiation, but Alice needs to be 
able to make the measurement fast enough that she can then jump into the black hole 
before it evaporates. In this paper we will argue that Alice simply does not have time 
to do this. 

As a result one can consider modifying her postulated quantum mechanics, for
example in one of the two directions described in the introduction. In the "strong
complementarity" approach, Alice's quantum mechanics has no direct relation to Charlie's.
They are related only insofar as they must agree on the results of experiments which are
visible to both of them. Since operators acting on $\mathcal{H}_{R_B}$ are not accessible to Alice, our
computational criterion for the breakdown of effective field theory allows us to either
disentangle $\mathcal{H}_{R_B}$ from $\mathcal{H}_B$ in Alice's theory (but not Charlie's) or perhaps more simply
to just remove $\mathcal{H}_{R_B}$ from her Hilbert space altogether. This then "frees up" B to be
entangled with A, ensuring Alice a smooth journey across the horizon.

In "standard complementarity" we instead think of Alice's theory as being embedded
in Charlie's. To avoid firewalls one might try to arrange that operators on Alice's
$\mathcal{H}_A$ are really just her interpretations of operators acting on what Charlie would have
called $\mathcal{H}_{R_B}$. As mentioned in the introduction, this idea, which is essentially an
enhanced version of the original proposal of [4-6], is referred to as "$A = R_B$" since the
entanglement will only work out consistently if we "build" interior operators out of
exterior operators which already have the correct entanglement with operators acting
on $\mathcal{H}_B$. Without limitations on Alice's ability to directly measure $R_B$ however it seems
to lead to paradoxes and has thus been viewed with some skepticism. We view our
work as potentially restoring the credibility of this proposal. We will discuss this a bit 
more concretely in section 6 below. 

3 The AMPS Experiment as a Quantum Computation 

We now begin our discussion of the decoding problem confronting Alice. We will 
phrase the discussion in terms of Charlie's Hilbert space, since for the moment we are 
following AMPS and granting that Charlie and Alice must agree on the density matrix 
of $\mathcal{H}_B \otimes \mathcal{H}_R$. For simplicity we will throughout model all Hilbert spaces using finite
numbers of qubits. In the Schmidt basis the state of the old black hole she is interested
in is given by equation (2.4), but this basis is very inconvenient for discussing Alice's
actions. From here on we will use exclusively a basis for the radiation field which is
simple for Alice to work with, and whose elements we will write as

$$|bhr\rangle_R = |b_1 \ldots b_k, h_1 \ldots h_m, r_1 \ldots r_{n-k-m}\rangle_R. \qquad (3.1)$$

Here there are $n = \log_2 |R|$ total qubits, each of which we assume Alice can manipulate
easily. $b_1 \ldots b_k$ are the first k of these qubits, where k is the number of bits in $\mathcal{H}_B$,
and m is the number of bits in $\mathcal{H}_H$. We can think of k + m as the number of qubits
remaining in the black hole. The $r_i$ qubits make up the remainder of the modes which
have non-trivial occupation from the Hawking radiation. Roughly we might expect
that

$$n \approx S_{initial} - k - m, \qquad (3.2)$$

where $S_{initial}$ is the horizon area in Planck units of the original black hole prior to any
evaporation. This is something of an underestimate because the computational basis
is local while the information is non-local, but this "coarse-graining" enhancement
isn't large. The radiation mostly comes out in s-wave quanta so it is effectively
one-dimensional. It extends out to a distance $L \sim M^3$ and consists mostly of quanta whose
energy is of order 1/M, so its thermal entropy is $n \sim LT \sim M^2$, which is still of order
the black hole entropy as we would conclude from (3.2). 13 Perhaps surprisingly the
black hole makes quite efficient use of the information storage capacity available to it. 
We will see below that as long as Alice intends to jump in while the black hole is still 
of macroscopic size, to make her computation simple she wants n to be as small as 
possible. She thus wants to begin decoding as soon as possible after the Page time. In 
what follows one should thus think of n as being just slightly larger than the entropy 
of the remaining black hole. 

We will adopt standard terminology and refer to the basis (3.1) as the computational
basis. In the computational basis we can write the state (2.4) as

$$|\Psi\rangle = \frac{1}{\sqrt{|B||H|}} \sum_{b,h} |b\rangle_B |h\rangle_H U_R |bh0\rangle_R, \qquad (3.3)$$

where $U_R$ is some complicated unitary transformation on $\mathcal{H}_R$. What unitary transformation
it is will depend on the details of quantum gravity, as well as the initial state
of the black hole. For simplicity we have defined it to act on the state where all of 

13 We thank Don Page for several useful discussions of this point. 
the $r_i$ qubits are zero. Clearly the challenge Alice faces is to apply $U_R^\dagger$ to the Hawking
radiation, after which it will be easy for her to confirm the entanglement between $\mathcal{H}_B$
and $\mathcal{H}_{R_B}$. Engineering a particular unitary transformation to act on some set of qubits
is precisely the challenge of quantum computation, and we will henceforth often refer 
to Alice's task as a computation. 

So far we have been interpreting B as the thermal atmosphere of the black hole,
but to actually test the AMPS entanglement it would be silly for Alice to try to decode
all of the atmosphere. Indeed the separation between $\mathcal{H}_B$ and $\mathcal{H}_H$ is rather ambiguous,
and we are free to push some of the atmosphere modes we are not interested in into $\mathcal{H}_H$.
So from here on we will mostly take k to be $O(n^0)$, i.e. we will consider the case where
Alice is only trying to check the entanglement for a few of the bits in the atmosphere.
This simplifies her computation, because in any event she only needs to implement $U_R^\dagger$
up to an arbitrary element of $U(2^{n-k})$ acting on the last $n - k$ qubits of the radiation.
In other words the set of things she is really after is elements of $U(2^n)/U(2^{n-k})$.

Since the unitary group is continuous it is clear that Alice will not be able to do the 
computation exactly. We thus need a good definition of how "close" she needs to get 
to reliably test the entanglement. One standard way to quantify closeness of operators 
is the trace norm [32], which for an operator A is defined as 

$$\|A\|_1 = \mathrm{Tr}\left(\sqrt{A^\dagger A}\right) . \qquad (3.4)$$

When $A$ is hermitian this is just the sum of the absolute values of its eigenvalues. The
motivation for this definition is as follows: say $\rho_1$ and $\rho_2$ are two density matrices, and
$\Pi_a$ is a projection operator for some measurement to give result $a$. Then

$$|P_1(a) - P_2(a)| = \left|\mathrm{Tr}\{(\rho_1-\rho_2)\Pi_a\}\right| \le \|\rho_1-\rho_2\|_1 , \qquad (3.5)$$

so if the trace norm of the difference of two density operators is less than $\epsilon$ then the
probabilities they predict for any experimental result will differ by at most $\epsilon$. The trace
norm of their difference is clearly preserved by unitary evolution.

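If both states are pure then the trace norm of their difference has a simple
interpretation. For any two pure states $|\Psi_1\rangle$ and $|\Psi_2\rangle$ we can write that

$$|\Psi_2\rangle = e^{i\alpha}\left(\sqrt{1-\frac{\delta^2}{4}}\,|\Psi_1\rangle + \frac{\delta}{2}|\chi\rangle\right), \qquad (3.6)$$

where $|\chi\rangle$ is orthogonal to $|\Psi_1\rangle$, $\alpha$ is real, and $\delta$ is real and positive. A simple
calculation then shows that

$$\left\| |\Psi_2\rangle\langle\Psi_2| - |\Psi_1\rangle\langle\Psi_1| \right\|_1 = \delta . \qquad (3.7)$$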
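The relations (3.4)-(3.7) are easy to check numerically. The following is a minimal sketch, not part of the original analysis: the Hilbert space dimension (8), the random seed, and the helper names `trace_norm` and `random_pure_state` are arbitrary choices made for illustration. It computes the trace norm as the sum of singular values, compares it with the $\delta$ defined through the overlap in (3.6), and checks the bound (3.5) for one randomly chosen rank-one projector.

```python
import numpy as np

def trace_norm(a):
    """Trace norm ||A||_1 = Tr sqrt(A^dagger A), i.e. the sum of singular values."""
    return np.linalg.svd(a, compute_uv=False).sum()

def random_pure_state(dim, rng):
    """Normalized random complex vector (illustrative helper)."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
psi1 = random_pure_state(8, rng)
psi2 = random_pure_state(8, rng)
rho1 = np.outer(psi1, psi1.conj())
rho2 = np.outer(psi2, psi2.conj())

# From (3.6): |<psi1|psi2>|^2 = 1 - delta^2 / 4.
overlap = abs(np.vdot(psi1, psi2))
delta = 2.0 * np.sqrt(1.0 - overlap**2)

# Left hand side of (3.7).
dist = trace_norm(rho2 - rho1)

# Bound (3.5) for a single random rank-one projector Pi_a.
phi = random_pure_state(8, rng)
proj = np.outer(phi, phi.conj())
gap = abs(np.trace((rho1 - rho2) @ proj))

print(f"delta = {delta:.6f}, trace norm = {dist:.6f}, |P1 - P2| = {gap:.6f}")
```

The printed trace norm should agree with $\delta$ to numerical precision, and the probability difference should not exceed it.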
Figure 2. What the computer does. The connecting lines at the top and bottom indicate
entanglement, and time goes up. The subsystem H goes along for the ride, and after the 
computation its purification is split between R and C in some complicated way. 

3.1 Quantum Computing is Hard 

We begin with a rather formal discussion of Alice's computation to illustrate some basic 
limitations on what is possible; we will take a more standard approach in the following 
subsection. In order to do her computation Alice needs to adjoin the radiation to some
computer, whose initial state lives in a new Hilbert space $\mathcal{H}_C$, and then wait for the
natural unitary evolution $U_{\rm comp}$ on $\mathcal{H}_R \otimes \mathcal{H}_C$ to undo $U_R$ and put the
bits which are entangled with $B$ into an easily accessible form, let's say the first $k$ qubits of
the memory of the computer. We show this pictorially in figure 2. For most of this subsection we
will fix the amount of time that the computer runs for, meaning that we will take $U_{\rm comp}$
to be determined by the laws of physics and thus unchangeable. The only way Alice
has any hope of getting the computer to do what she wants is by carefully choosing 
its initial state. Without loss of generality she can take this initial state to be pure, 
perhaps at the cost of increasing the size of the computer. We will show that no matter 
how large she makes her computer, it is very unlikely that she will be able to find even 
one initial state which does the computation. 

More precisely what Alice would like to do is find a state $|\Psi\rangle_C$ which for all $b$ and
$h$ evolves as

$$U_{\rm comp} : U_R |bh0\rangle_R \otimes |\Psi\rangle_C \mapsto |\text{something}\rangle \otimes |b\rangle_{\rm mem} , \qquad (3.8)$$

where $|\text{something}\rangle$ is any pure state of the computer and radiation minus the first $k$
bits of the memory. $|\text{something}\rangle$ can and will be different for different $b$ and $h$. To
estimate how likely it is that such a $|\Psi\rangle_C$ exists, we discretize the Hilbert space using
the trace norm. In any Hilbert space $\mathcal{H}$ of dimension $d$ we can find a finite set
$S_\epsilon \subset \mathcal{H}$ with the property that any pure state in $\mathcal{H}$ is within trace norm
distance $\epsilon$ of at least one element of $S_\epsilon$. Such a set is called an $\epsilon$-net, and it
is not too hard to get an estimate of how many elements it must have [33]. One first observes from
(3.7) that half of the trace norm difference is weakly bounded by the Hilbert space norm:

$$\left\| |\Psi_2\rangle - |\Psi_1\rangle \right\|_2^2 = 2\left(1 - \cos\alpha \sqrt{1-\frac{\delta^2}{4}}\right) \ge \frac{\delta^2}{4} = \frac{1}{4} \left\| |\Psi_2\rangle\langle\Psi_2| - |\Psi_1\rangle\langle\Psi_1| \right\|_1^2 . \qquad (3.9)$$

Thus an $\epsilon/2$-net for the Hilbert space norm is also an $\epsilon$-net for the trace norm. The
minimal size of an $\epsilon/2$-net for the Hilbert space norm is the number of balls of radius
$\epsilon/2$ centered on points on the unit sphere in $\mathbb{R}^{2d}$ that are needed to cover it, which
at large $d$ is proportional to some small power of $d$ times $(2/\epsilon)^{2d}$.$^{14}$ Intuitively we
may just think of unitary evolution as an inner-product preserving permutation of the
$(2/\epsilon)^{2d}$ states.

Applying this now to our discussion of the computer, for fixed $b$ and $h$ the total
number of possible states that could appear on the right hand side of (3.8) is
$(2/\epsilon)^{2|C||R|}$. The number of possible $|\text{something}\rangle$'s is
$(2/\epsilon)^{2|R||C|2^{-k}}$. For a given $|\Psi\rangle_C$, the probability that (3.8) holds for all
$2^{k+m}$ values of $b$ and $h$ is then $(2/\epsilon)^{-2|C||R|\left(1-2^{-k}\right)}$. The number
of initial states is $(2/\epsilon)^{2|C|}$, so the probability that Alice can find one for which (3.8)
holds is$^{15}$

$$P = \left(\frac{2}{\epsilon}\right)^{-2|C|\left(|R|\left(1-2^{-k}\right)-1\right)} . \qquad (3.10)$$

For any nontrivial $k$ and $|R| = 2^n$, it is clear that this probability is extraordinarily
small. Making the computer bigger just makes it even more unlikely the computation
can be done!

What then is Alice to do? One might hope that, although the probability of success
is small for any given computer size, by searching over many values of $|C|$ Alice might
find one that works. This is a bad idea; summing (3.10) over $|C|$ produces a finite sum

14 This is a slight overestimate for the size of the trace norm $\epsilon$-net because the Hilbert space norm
distinguishes between states that differ only by a phase while the trace norm does not. We can fix
this by taking the quotient of the unit $S^{2d-1}$ by the phase to get to a unit $\mathbb{CP}^{d-1}$, whose
volume is just a negligible power in $d$ times the volume of $S^{2d-1}$. This quotient effectively sets
$\alpha = 0$ between any two states, in which case equation (3.9) tells us that the Hilbert space norm
becomes close to one half the trace norm. The induced metric $\mathbb{CP}^{d-1}$ inherits from
$\mathbb{R}^{2d}$ will then for small $\epsilon$ be the same as one half the trace norm distance. The upshot
is then that the number of balls needed to cover $\mathbb{CP}^{d-1}$ scales like $(2/\epsilon)^{2d-2}$.

15 In this counting we can easily ignore the constraint that orthogonal states must be sent to orthog- 
onal states. 
whose value remains exponentially small in $|R|$. Alice can do better however by varying
the running time of the computer. This lets her sample a variety of $U_{\rm comp}$'s without
increasing the size of the Hilbert space. If each $U_{\rm comp}$ is different, then the longest she
might have to wait to get a $U_{\rm comp}$ that works is

(3.11)

Although finite, this is unimaginably long for any reasonable system size. For an
astrophysical black hole in our universe it is something like $10^{\cdots}$ years.

The timescale (3.11) has a simple physical interpretation; it is the quantum recurrence
time. This is the timescale over which a quantum system comes close to any
given quantum state, and as we found here is double-exponential in the entropy of the
whole system. In doing the computation this way, Alice is simply waiting around for a
quantum recurrence to do it by pure chance.

Fortunately for civilization, these simple estimates are not the final word on quantum
computing power. In particular (3.11) does not really hold unless $U_{\rm comp}$ is chosen
randomly at each time step. To the extent that there is some structure in how $U_{\rm comp}$
varies with time, as there is in our world, Alice can take advantage of it to speed up her
computation. Similarly if the way $U_{\rm comp}$ changes with increasing $|C|$ also has structure,
she can use that as well. The lesson of this section however is that without using special
properties of the computer-radiation dynamics, no amount of preparation of the
initial state of her computer will allow Alice to do her computation in any reasonable
amount of time. In the following two sections we will see that by using such physical
properties Alice is able to beat the double exponential in computer entropy down to a
single exponential in just the radiation entropy, but we will also argue that that is all
she gets.

As a tangential comment it is interesting to note that the result of this section
is actually special to quantum mechanics; there is a somewhat analogous problem in
classical coding which can easily be solved by making the computer bigger. Say that
we have a classical bit string of length $n$. There are $2^n$ such strings, but say we are
interested in some subset of size $2^k$. For example this could be the set with a $k$-bit
message in the first $k$ bits and zero for the rest. Acting with some random permutation
on the space of $2^n$ strings, we can send these $2^k$ strings to a set of $2^k$ scrambled
"code words", which are analogous to some basis for $\mathcal{H}_{R_B}$ in the quantum problem. We
could then imagine adjoining one of our $n$-bit strings to a $c$-bit "computer" string, and
then acting with a given permutation of the $2^{n+c}$ states of this larger system. This
permutation is the analogue of our $U_{\rm comp}$. The question is then the following: given
this larger permutation, can we find a single initial string for the computer such that,
after the permutation is applied, it will send the set of codewords to a set for which
the message is again displayed in the first $k$ bits? It turns out that the answer to
this question is yes; out of the

$$\binom{2^{n+c}}{2^k} \approx 2^{\left(n+c-k+\frac{1}{\log 2}\right)2^k} \qquad (3.12)$$

possible sets of codewords there are $2^{(n+c-k)2^k}$ which have the message in the first $k$ bits.
Thus for a given initial string for the computer the probability that the permutation
sends a generic set of codewords to a "good" set is

$$P \approx e^{-2^k} , \qquad (3.13)$$

which for fixed $k$ we can easily beat by trying the various $2^c$ initial states for the
computer. It does require a $c \sim 2^k$-bit computer however.
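The counting in (3.12) and (3.13) can be checked directly. The sketch below is an illustration only; the values $n = 40$, $c = 20$, $k = 6$ and the function name `log2_binomial` are assumptions made for the example. It evaluates the logarithm of the binomial coefficient exactly, term by term, and compares it with the leading-order Stirling form used in (3.12). Because that form drops a subleading factor of order $\sqrt{2\pi\, 2^k}$, the resulting $\log P$ matches $-2^k$ only at leading order in $2^k$.

```python
import math

def log2_binomial(n_total, n_choose):
    """log2 of C(n_total, n_choose), summed term by term to avoid cancellation."""
    return sum(math.log2((n_total - i) / (n_choose - i)) for i in range(n_choose))

n, c, k = 40, 20, 6                       # illustrative values, not from the text
size = 2 ** k                             # number of codewords

log2_all = log2_binomial(2 ** (n + c), size)              # exact left side of (3.12)
log2_stirling = (n + c - k + 1 / math.log(2)) * size      # right side of (3.12)
log2_good = (n + c - k) * size                            # sets with message in first k bits

ln_p = (log2_good - log2_all) * math.log(2)               # natural log of the probability
bits_needed = -(log2_good - log2_all)                     # c such that 2^c * P is of order one

print(f"log2 C exact = {log2_all:.2f}, Stirling form = {log2_stirling:.2f}")
print(f"ln P = {ln_p:.2f}  versus  -2^k = {-size}")
print(f"computer bits needed ~ {bits_needed:.1f}")
```

The number of computer bits needed to make success likely grows like $2^k$, as stated after (3.13).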

3.2 Implementing a General Unitary Transformation with Quantum Gates 

In our world, the result of the previous section, that doing a quantum computation at
worst takes a time which is double exponential in the entropy of the computer, had
better be improved on if we are ever to do any quantum computation at all. It can
be improved upon of course, and the reason is that locality of interactions makes the
dependence of $U_{\rm comp}$ on time and computer size very special indeed.$^{16}$ Since no quantum
computer has yet been built we do not know exactly how one might be implemented
physically, but there is a widely accepted model for quantum computation called the
quantum circuit model. In the quantum circuit model one imagines having a "quantum
memory" consisting of $n$ qubits, on which one can easily act with some finite set of
two-qubit unitary transformations, called quantum gates, on any two of the qubits.
The computer builds up larger unitary transformations by applying the various gates
successively. Interestingly the number of different types of gates needed to generate
arbitrary unitary transformations with high precision is quite small. In fact, one is
sufficient provided it is generic enough and that it can be applied to any two of the qubits
(and in either order on those two). A set of gates that has this property is called universal.
A specific set of three gates which is universal is the Hadamard gate, which acts on a
single qubit as

$$\begin{aligned} H|0\rangle &= \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) \\ H|1\rangle &= \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right), \end{aligned} \qquad (3.14)$$



16 This subsection is entirely pedagogical and contains no original material, good references are 
[34-36]. 
Figure 3. The standard representations of the three gates described in the text, as well as
a simple circuit that maps the product basis $|b_1, b_2\rangle$ to a basis each element of which has the
two qubits maximally entangled. In the CNOT gate the addition is done at the hollow circle.

the $Z^{1/4}$ gate which acts on a single qubit as$^{17}$

$$\begin{aligned} Z^{1/4}|0\rangle &= |0\rangle \\ Z^{1/4}|1\rangle &= e^{i\pi/4}|1\rangle , \end{aligned} \qquad (3.15)$$

and the CNOT gate $U_{\rm cnot}$, for "controlled not", which acts on two qubits as

$$U_{\rm cnot}|b_1, b_2\rangle = |b_1, b_1+b_2\rangle \qquad (3.16)$$

with the addition being mod 2. This gate flips the second bit if and only if the first bit 
is 1. There is a standard graphical notation for representing circuits, which we have 
already used in figure 2, and we illustrate it some more in figure 3. 
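The three gates (3.14)-(3.16) are small enough to write out as explicit matrices. The following sketch is an illustration rather than part of the original text. It assumes the basis ordering $|b_1 b_2\rangle \to$ index $2b_1 + b_2$, and it assumes that the entangling circuit of figure 3 is a Hadamard on the first qubit followed by a CNOT, which is the standard two-gate choice. It also carries out the exercise suggested in footnote 17 by building the Pauli operators from $H$ and $Z^{1/4}$.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard, (3.14)
T = np.diag([1, np.exp(1j * np.pi / 4)])                       # Z^{1/4}, (3.15)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)                 # (3.16), control = first qubit
I2 = np.eye(2, dtype=complex)

# Pauli operators from H and Z^{1/4}.
Z = np.linalg.matrix_power(T, 4)   # (Z^{1/4})^4 = Z
X = H @ Z @ H                      # Hadamard switches Z and X eigenbases
Y = 1j * X @ Z                     # Y = i X Z

# Assumed form of the entangling circuit: Hadamard on qubit 1, then CNOT.
entangler = CNOT @ np.kron(H, I2)

def reduced_first_qubit(state):
    """Reduced density matrix of the first qubit of a two-qubit pure state."""
    m = state.reshape(2, 2)
    return m @ m.conj().T

for j in range(4):
    rho = reduced_first_qubit(entangler[:, j])
    print(j, np.round(rho.real, 3).tolist())
```

Each image of a product basis state has a maximally mixed reduced density matrix on the first qubit, which is the statement that the two qubits are maximally entangled.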

We can now ask how many gates are needed to make a complicated unitary transformation
like $U_R$ in equation (3.3). This is a good measure of the amount of time/space
needed to actually do the computation, since we can imagine that the gates can be
implemented one after another in a time that scales at most as a small power of $n$. For
a set of $f$ fundamental gates, the number of circuits we can make which use $T$ total
gates is clearly

$$\left(\binom{n}{2} f\right)^T \approx \left(n^2 f\right)^T . \qquad (3.17)$$

To proceed further we need some basic idea of size and distance for the unitary
group. The unitary group on $n$ qubits is a compact manifold of dimension $2^{2n}$, and we

17 In this equation we see the unfortunate but standard convention in quantum computation theory
that the Pauli-z operator, usually written as $Z$, acts as $Z|0\rangle = |0\rangle$ and $Z|1\rangle = -|1\rangle$. The
Hadamard operator can then be interpreted as switching from the $Z$ eigenbasis to the $X$ eigenbasis.
As an example it is worth seeing how to build the Pauli operators $X$, $Y$, and $Z$ out of $H$ and
$Z^{1/4}$ gates.
can parametrize its elements as

$$U = \exp\left(i \sum_a c_a t_a\right) . \qquad (3.18)$$

Here $t_a$ are generators of the Lie algebra of $U(2^n)$, and we can very roughly think of
the $c_a$'s as parametrizing a unit cube in $\mathbb{R}^{2^{2n}}$. Also roughly we can think of linear
distance in this unit cube as a measure of distance between the unitaries. For example
say we wish to compute the difference between acting on some pure state $|\Psi\rangle$ with two
different unitary matrices $U_1$ and $U_2$ and then projecting onto some other state $|\chi\rangle$:

$$\langle\chi|(U_1-U_2)|\Psi\rangle = \langle\chi|\left(I - U_2U_1^\dagger\right)U_1|\Psi\rangle \approx -i\langle\chi|\sum_a \delta c_a t_a U_1|\Psi\rangle . \qquad (3.19)$$

If the sum of the squares of the $\delta c_a$'s is less than $\epsilon^2$, the right hand side will be at most
some low order polynomial in $2^n$ times $\epsilon$. This polynomial is irrelevant as we now see.

Around each of our $(n^2 f)^T$ circuits we can imagine a ball of radius $\epsilon$ in
$\mathbb{R}^{2^{2n}}$. The volume of all the balls together will be of order the full volume of the
unitary group when

$$(n^2 f)^T \epsilon^{2^{2n}} \approx 1 . \qquad (3.20)$$

Thus we see that in order to be able to make generic elements of $U(2^n)$ we need at least

$$T \sim 2^{2n}\log\left(\frac{1}{\epsilon}\right) \qquad (3.21)$$

gates, where we have kept only the leading dependence on $n$ and $\epsilon$. As promised,
because $\epsilon$ appears inside a logarithm the crude nature of our definition of distance
has not mattered. (The more interesting converse to this statement is known as the
Solovay-Kitaev theorem.) More importantly, we see that the number of gates is now
only a single exponential in (twice) the entropy. So the quantum circuit model is able
to do arbitrary quantum computations much faster than our calculation of the previous
section suggested; this is essentially because locality enables us to dynamically isolate
parts of the computer in such a way that we can push the chaos of the system
into heating the environment (not explicitly modeled here) instead of messing up our
computation.
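To get a feel for the size of the bound, one can solve (3.20) for $T$ while keeping the $\log(n^2 f)$ factor that (3.21) drops. The sketch below is purely illustrative; the function name `min_gate_count` and the choices $f = 3$ and $\epsilon = 10^{-3}$ are assumptions made for the example.

```python
import math

def min_gate_count(n, f, eps):
    """Solve (n^2 f)^T * eps^(2^(2n)) = 1 for T, i.e. equation (3.20)."""
    return 4 ** n * math.log(1.0 / eps) / math.log(n * n * f)

# Illustrative parameters: three fundamental gates, precision 1e-3.
for n in (5, 10, 20, 40):
    print(f"n = {n:2d}  ->  T ~ {min_gate_count(n, f=3, eps=1e-3):.3e}")
```

Each additional qubit multiplies the required number of gates by nearly four, while improving the precision $\epsilon$ only changes $T$ logarithmically.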

Given that we have so quickly beaten down a double exponential to a single expo- 
nential, one might be optimistic that further reduction in computing time is possible. 
Unfortunately, in our universe that does not seem to be the case. Simple modifications 
of the quantum circuit model such as changing the set of fundamental gates or con- 
sidering higher spin fundamental objects instead of qubits, for example qutrits, make 
only small modifications to the analysis and don't change the main $2^{2n}$ scaling. One
could imagine trying to engineer gates that act on some finite fraction of the $n$ qubits
all at once, perhaps by connecting them all together with wires or something, but it
is easy to see that any such construction requires a number of wires exponential in
$n$. One could also try to parallelize by applying gates on non-overlapping qubits
simultaneously whenever possible, as well as adding additional "ancillary" qubits. As
long as the number of extra qubits scales like some power of $n$, however, it is clear they
cannot beat the $2^{2n}$. Even with exponentially many ancilla or wires just the travel time
between the various parts of the computer will be exponential in $n$. In the face of these
difficulties the reader might be tempted to try using some sort of exotic nonlocal system
like a black hole to do the computation, but this would just give up what the circuit
model accomplished and most likely return us to the even worse situation of the previous
section. In this paper we will adopt the widely held point of view that the quantum circuit
model accurately describes what are physically realistic expectations for the power of
a quantum computer. Thus if $U_R$ has no special structure, Alice cannot implement it
(or its inverse) in time shorter than $2^{2n}$.

3.3 Why is Alice's Computation Slower than the Black Hole Dynamics? 

We now turn to the question of whether or not the black hole dynamics constrain $U_R$
in any way that could help Alice implement it faster. One thing we know about the
black hole is that it produces the state (3.3) relatively quickly, in a time that scales
like $n^{3/2}$ for a Schwarzschild black hole. This seems to suggest that Alice might be
able to implement $U_R^\dagger$ quickly by some sort of time-reversal. This turns out not to be the
case. To explain this we introduce a slightly more detailed model of the dynamics that
produce the state (3.3).

To describe the evaporation process it is clearly necessary to have a Hilbert space
in which we can have black holes of different sizes. We can write this as

$$\mathcal{H} = \bigoplus_{n=0}^{n_f}\left(\mathcal{H}_{BH,n_f-n}\otimes\mathcal{H}_{R,n}\right) . \qquad (3.22)$$

Here the subscripts $n$ and $n_f-n$ indicate the number of qubits in the indicated Hilbert
spaces. The dimensionality of $\mathcal{H}$ is roughly $n_f 2^{n_f}$. We can imagine starting in the subspace
with $n = 0$ and then in each time-step acting with a unitary transformation that increases
$n$ by one. We will take the evolution on the radiation to be trivial. The black hole
becomes old after $n_f/2$ steps. This "adiabatic" model of evaporation assumes equation
(3.2) is exact and does not involve any energetics but, as discussed below equation
(3.2), it is not a bad approximation for the decay of a Schwarzschild black hole: the
number of Hawking quanta produced is of order the entropy of the black hole, and so
is their coarse-grained entropy.
Figure 4. The black hole dynamics for a 7-bit black hole. With each step the subfactor we
interpret as the radiation gets larger. 

An actual black hole formed in collapse will have some width in energy, which here
means a width in $n$, but by ignoring this we can make a further simplification. Starting
in one of the $2^{n_f}$ states with $n = 0$, the evolution never produces superpositions of
different $n$. So we can actually recast the whole dynamics as unitary evolution on a
smaller Hilbert space of dimension $2^{n_f}$, but in which the interpretation of subfactors
changes with time. We illustrate this with a circuit diagram in figure 4.

With this simplification we can now combine all of the timesteps together into one
big unitary matrix $U_{\rm dyn}$ acting on our $2^{n_f}$ dimensional Hilbert space. The matrix $U_R$
appearing in the state (3.3), which we now interpret as having been produced by $U_{\rm dyn}$,
will (unlike $U_{\rm dyn}$) depend rather sensitively on the initial state, and since Alice only
needs to be able to do the computation for some particular initial state we will for
simplicity choose it to just have all the bits set to zero. For $n > n_f/2$ we thus expect the
following to be true:

$$U_{\rm dyn}|00000\rangle_{\rm init} = \frac{1}{\sqrt{|B||H|}}\sum_{b,h}|b\rangle_B|h\rangle_H U_R|bh0\rangle_R . \qquad (3.23)$$

So this equation tells us something about $U_R$, whose complexity we are interested
in understanding. To proceed further we need to make some sort of assumption about
$U_{\rm dyn}$. This is a question about the dynamics of quantum gravity so we can't say anything
too precise, but for those black holes which are well understood in matrix theory
[37] or AdS/CFT [38-40] the dynamics are always some matrix quantum mechanics or
matrix field theory. Theories of this type can usually be simulated by polynomial-sized
quantum circuits [41-44], so it seems quite reasonable to assume that $U_{\rm dyn}$ can be generated
by a polynomial number of gates.$^{18}$ Such circuits are usually called "small", so
more precisely we want to know the following: does the existence of a small circuit for
$U_{\rm dyn}$ imply the existence of a small circuit for $U_R$? If the answer is yes, then our model
would imply that Alice can decode $R_B$ out of the Hawking radiation fairly easily.

It is clear that acting on the state $|00000\rangle_{\rm init}$ we can easily decompose $U_{\rm dyn}$ into
$U_R U_{\rm mix}$, where $U_{\rm mix}$ is a simple circuit that entangles the first four subfactors in
$|00000\rangle_{\rm init}$:

$$U_{\rm mix}|00000\rangle_{\rm init} = \frac{1}{\sqrt{|B||H|}}\sum_{b,h}|b\rangle_B|h\rangle_H|bh0\rangle_R . \qquad (3.24)$$

$U_{\rm mix}$ is very easy to implement: we can just use the circuit on the right in figure 3
$n_f - n$ times for a total of $2(n_f - n)$ gates. We can then define a new operator

$$\tilde{U}_R = U_{\rm dyn}U_{\rm mix}^\dagger , \qquad (3.25)$$

which has the property that

$$\tilde{U}_R \frac{1}{\sqrt{|B||H|}}\sum_{b,h}|b\rangle_B|h\rangle_H|bh0\rangle_R = \frac{1}{\sqrt{|B||H|}}\sum_{b,h}|b\rangle_B|h\rangle_H U_R|bh0\rangle_R . \qquad (3.26)$$

$\tilde{U}_R$ can obviously be implemented with a small circuit, and it seems to be
exactly what Alice needs; she can just apply the inverse circuit to the state (3.3) and
the decoding is accomplished. Unfortunately for her this does not work. Although the
operator $\tilde{U}_R$ appears to only act on the radiation, the circuit this construction provides
involves gates that act on all of the qubits. While she is doing the decoding Alice does
not have access to the qubits in $B$ and $H$, so she cannot directly use them. Of course,
if the circuit really acted as the identity operator on $B$ and $H$ for any initial state this
would not matter; she could just throw in some ancillary qubits in an arbitrary state to
replace those in $B$ and $H$ and still use the $\tilde{U}_R^\dagger$ to undo $U_R$. The problem is that (3.26)
holds only when $\tilde{U}_R$ acts on the particular state
$\frac{1}{\sqrt{|B||H|}}\sum_{b,h}|b\rangle_B|h\rangle_H|bh0\rangle_R$. This can
be traced back to the fact that the definition of $U_R$ in the first place depended on the
initial state of the black hole that $U_{\rm dyn}$ acts on.

Although Alice cannot use these small circuits to decode the entanglement, she
can move it around. For example acting with a CNOT gate three times on two qubits,
switching which qubit is the "control" qubit each time, exchanges the pair:

$$|b_1, b_2\rangle \to |b_1, b_1+b_2\rangle \to |b_2, b_1+b_2\rangle \to |b_2, b_1\rangle , \qquad (3.27)$$



18 Technically this also assumes that the mapping from the "microscopic" degrees of freedom on 
which the quantum mechanics looks simple to the "macroscopic" basis (3.1) is relatively simple. For 
low energy fields outside the horizon this seems plausible to us, for example in AdS/CFT the well- 
known construction of [45] seems to accomplish this in a straightforward way for operators outside the 
horizon. We discuss this more in section (6) below. 
so by adjoining a set of $n$ ancillary qubits to the state Alice can use this operation
to achieve

$$\frac{1}{\sqrt{|B||H|}}\sum_{b,h}|b\rangle_B|h\rangle_H U_R|bh0\rangle_R|000\rangle_{\rm anc} \to \frac{1}{\sqrt{|B||H|}}\sum_{b,h}|b\rangle_B|h\rangle_H|000\rangle_R U_R|bh0\rangle_{\rm anc} . \qquad (3.28)$$

This trivializes the state of the radiation, but of course doesn't really accomplish much 
since testing the entanglement still requires undoing Ur. It does show however that 
Alice can move the quantum information from the radiation to a more "stable" quantum 
memory in a short amount of time. 
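The three-CNOT exchange in (3.27) can be verified at the level of matrices, which confirms that it holds on superpositions and not only on basis states. This is a small illustrative sketch; the basis ordering $|b_1 b_2\rangle \to$ index $2b_1 + b_2$ and the single-qubit test state are assumptions made for the example.

```python
import numpy as np

# CNOT with the first qubit as control, and with the second qubit as control.
CNOT_12 = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]])
CNOT_21 = np.array([[1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0],
                    [0, 1, 0, 0]])
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]])

# Three CNOTs with alternating control qubit, as in (3.27).
swap_from_cnots = CNOT_12 @ CNOT_21 @ CNOT_12

# Move an arbitrary radiation qubit into an ancilla prepared in |0>.
psi = np.array([0.6, 0.8j])
zero = np.array([1.0, 0.0])
moved = swap_from_cnots @ np.kron(psi, zero)

print(np.array_equal(swap_from_cnots, SWAP))
print(np.allclose(moved, np.kron(zero, psi)))
```

Applying the same exchange qubit by qubit to $n$ radiation qubits and $n$ ancillas uses $3n$ CNOT gates, which is the sense in which the transfer in (3.28) is fast.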

The lesson of this section is that because Alice does not have access to all of the
qubits in the system, she is unable to simply time-reverse the black hole dynamics and
extract $R_B$ in a time that is polynomial in the entropy. Without such a simple
construction, she will in general be left with no option but to brute-force her construction
of $U_R$ using of order $2^{2n}$ gates.$^{19}$ It is still possible that some yet-unknown special
features of black hole dynamics will conspire to provide a simple circuit for $U_R$, but it
would be rather surprising. After all there are many ways a unitary could be atypical,
and most still require exponentially many gates. In the following section we will see
that for some analogous questions in the theory of error-correcting codes, within the
context of simple circuits with $O(n^2)$ gates it is possible to run into problems which
almost certainly take exponential time to solve. These results will unfortunately not
be directly applicable here, but the intuition they provide is still valuable. We will also
see that being able to efficiently perform a decoding very similar to $U_R$ would have very
unlikely implications for the complexity class Quantum Statistical Zero-Knowledge.



4 Quantum Coding and Error Correction 

The theory of quantum error correcting codes has interesting implications for the AMPS 
experiment, which we discuss in this section. 20 We will review the main points of 
this theory assuming no prior experience with the subject. This will be something 
of a sidetrip from our main exposition, so casual readers may want to skip over this 

19 Of course even if she could just time reverse the black hole, the circuit would still take of order the 
evaporation time to run. In fact since the black hole is old it would probably take longer to run than 
the evaporation. We are not comfortable with this argument as a way out of firewalls however. Often 
when there is a general algorithm that gives a polynomial circuit to do something, special details of 
the problem and tricks like parallelization can be exploited to get polynomial increases in the running 
speed. 

20 A recent paper [46] also discussed the AMPS argument in the language of error correction; their 
use of error correction was quite different from that we discuss here. 
Figure 5. Quantum error correction. Here $S$ is the system whose state we want to restore, $E$
is the environment it becomes entangled with via the interaction $U_{\rm noise}$, and $A$ is the ancilla
it interacts with via $U_{\rm correct}$. At the end the environment is entangled with the ancilla and
the system $S$ is in the same quantum state it started in.

section in a first reading. For the impatient the conclusions of relevance for AMPS are 
summarized at the end of the section. 

4.1 Review of Error Correcting Codes 

Typically quantum systems cannot be isolated from their environment. It is interesting 
to understand to what extent a system which began in an unknown state can be restored 
to that state after it has interacted nontrivially with its environment. Usually this is 
done by introducing another ancillary system and transferring the entanglement with 
the environment from the system whose state we want to restore to the ancillary system. 
We show this pictorially in figure 5. 

Error correction is not always possible. For example, say that the transformation
$U_{\rm noise}$ from figure 5 is such that it results in the system $S$ being maximally entangled
with the environment $E$. There is no information about its initial state remaining in
$S$, and no choice of $U_{\rm correct}$ will allow recovery. What is perhaps surprising is that
it is ever possible to restore the initial state after a nontrivial $U_{\rm noise}$ has acted. To
see that this can be done, following Shor [47] we consider an arbitrary superposition
$|\psi\rangle = a_+|+\rangle + a_-|-\rangle$ of the following two nine-qubit states:

$$|\pm\rangle = \frac{1}{\sqrt{8}}\left(|000\rangle \pm |111\rangle\right)\otimes\left(|000\rangle \pm |111\rangle\right)\otimes\left(|000\rangle \pm |111\rangle\right) . \qquad (4.1)$$

As a crude model of interaction with the environment, we can imagine that the state
$|\psi\rangle$ will be acted on randomly by a single Pauli operator $X$, $Y$, or $Z$ on one of its
nine qubits. After this can we restore the state without destroying it? As Shor
explained in [47], we can. The idea is to measure the following set of eight "check"
operators:

$$Z_1Z_2,\quad Z_2Z_3,\quad Z_4Z_5,\quad Z_5Z_6,\quad Z_7Z_8,\quad Z_8Z_9,\quad X_1X_2X_3X_4X_5X_6,\quad \text{and}\quad X_4X_5X_6X_7X_8X_9 . \qquad (4.2)$$

Since the states $|\pm\rangle$ are both eigenstates of eigenvalue one for all eight of these operators,
this measurement will do nothing to the state $|\psi\rangle$. Let's say however that interaction
with the environment has caused the state to be acted on by $X_1$. This commutes with
the last seven check operators, so their eigenvalues are unaffected, but it will change
the result of measuring $Z_1Z_2$ from 1 to $-1$ for both states $|\pm\rangle$. It is easy to see
that none of the other possible single-Pauli errors will give this signature as a result
of measuring the check operators. We can then "repair" the error up to an overall
irrelevant phase by acting with $X_1$. Similarly say that the error acts with $Z_1$. This will
now flip the eigenvalue of $X_1X_2X_3X_4X_5X_6$ without affecting any of the other check
operators. There are now two other single-Pauli errors with the same signature, $Z_2$ and
$Z_3$, but we can correct any of the three up to an overall phase by acting with $Z_1$ on the
state. In this manner it is easy to see that any single-Pauli error can be corrected.$^{21}$
This protocol is called the Shor code, and it was the first error correcting code to be 
discovered. 
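The syndrome bookkeeping for the nine-qubit code can be checked by brute force on the 512-dimensional Hilbert space. The sketch below is illustrative only. It assumes the standard form of Shor's code, with $|\pm\rangle$ the threefold tensor product of $(|000\rangle \pm |111\rangle)/\sqrt{2}$ and the eight check operators $Z_1Z_2$, $Z_2Z_3$, $Z_4Z_5$, $Z_5Z_6$, $Z_7Z_8$, $Z_8Z_9$, $X_1\cdots X_6$, $X_4\cdots X_9$; the helper names and the particular superposition are arbitrary choices.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])
Y = 1j * X @ Z

def on_sites(op, sites, n=9):
    """Act with `op` on the 1-indexed qubits in `sites`, identity elsewhere."""
    return reduce(np.kron, [op if q in sites else I2 for q in range(1, n + 1)])

def block(sign):
    """Three-qubit state (|000> + sign |111>) / sqrt(2)."""
    v = np.zeros(8)
    v[0], v[7] = 1.0, sign
    return v / np.sqrt(2)

plus = reduce(np.kron, [block(+1)] * 3)
minus = reduce(np.kron, [block(-1)] * 3)

checks = [on_sites(Z, s) for s in [(1, 2), (2, 3), (4, 5), (5, 6), (7, 8), (8, 9)]]
checks += [on_sites(X, (1, 2, 3, 4, 5, 6)), on_sites(X, (4, 5, 6, 7, 8, 9))]

def syndrome(state):
    """Eigenvalues (+1 or -1) of the eight check operators on `state`."""
    return tuple(int(round(np.vdot(state, c @ state).real)) for c in checks)

psi = 0.6 * plus + 0.8j * minus   # arbitrary normalized superposition
errors = {f"{name}{q}": on_sites(op, (q,))
          for name, op in (("X", X), ("Y", Y), ("Z", Z)) for q in range(1, 10)}
syndromes = {label: syndrome(e @ psi) for label, e in errors.items()}

print("no error:", syndrome(psi))
print("X1      :", syndromes["X1"])
print("Z1      :", syndromes["Z1"])
print("distinct syndromes among 27 single-Pauli errors:", len(set(syndromes.values())))
```

The 27 single-Pauli errors produce 21 distinct syndromes: the only collisions are among $Z$ errors within the same block of three qubits, and those are harmless because any one of them undoes the others up to a check operator.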

The Shor code works for two reasons. One is that it is redundant, meaning that 
the number of bits of information it protects is significantly fewer than the number 
of physical bits present. This allows "room" for noise to creep in without disrupting 
the message. The other reason is that it is nonlocal; the information is carried in the 
entanglement between multiple qubits, which protects it against local decoherence and 
depolarization. These observations motivate the general definition of a quantum code 
as a code subspace - a $k$-qubit subspace of a larger $n$-qubit Hilbert space out of which
we can build states we wish to protect. In the Shor code the code subspace is spanned
by the states $|\pm\rangle$. Given a code subspace $\mathcal{H}_C \subseteq \mathcal{H}$ we can always define an encoding
transformation $U_{\rm enc}$ with the property that

$$|c\rangle = U_{\rm enc}|c_1 \ldots c_k, 0 \ldots 0\rangle , \qquad (4.3)$$

with $|c\rangle$ a complete basis for $\mathcal{H}_C$. To be clear there are $n-k$ zeros on the right hand
side.

21 It is not hard to describe this procedure in terms of unitary interaction $U_{\rm correct}$ with an ancillary
system. For example to measure $Z_1Z_2$ we introduce a single ancilla qubit in the state $|0\rangle_{\rm anc}$ and then
use two CNOT gates with the first and second qubits being the control bits in the gate. This
accomplishes $|b_1, b_2\rangle|0\rangle_{\rm anc} \to |b_1, b_2\rangle|b_1\rangle_{\rm anc} \to |b_1, b_2\rangle|b_1+b_2\rangle_{\rm anc}$, which writes the
result of the measurement onto the ancillary qubit.
To describe error correction more generally, we first need to say more about how to
understand the noise generated in a quantum system $S$ interacting with an environment.
In such a situation we can always write

$$U_{\rm noise}|s0\rangle = \sum_{s',e}\langle s'e|U_{\rm noise}|s0\rangle|s'e\rangle = \sum_e M_e|se\rangle , \qquad (4.4)$$

where $|e\rangle$ is an orthonormal basis for the environment and the operators $M_e$ are called
Kraus operators [48]. They act only on the system $S$, have matrix elements

$$\langle s'|M_e|s\rangle = \langle s'e|U_{\rm noise}|s0\rangle , \qquad (4.5)$$

and obey

$$\sum_e M_e^\dagger M_e = 1 . \qquad (4.6)$$
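The definitions (4.4)-(4.6) translate directly into array manipulations. The following sketch is an illustration under stated assumptions: the noise unitary is drawn at random, the system and environment dimensions (2 and 3) are arbitrary, the joint basis is ordered as $|s\,e\rangle \to$ index $s \cdot \dim E + e$, and the function names are hypothetical. It extracts the Kraus operators from the matrix elements in (4.5) and checks both the completeness relation (4.6) and the decomposition (4.4).

```python
import numpy as np

def random_unitary(dim, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))   # rescale columns by phases; q stays unitary

def kraus_operators(u_noise, dim_s, dim_e):
    """M_e with matrix elements <s'|M_e|s> = <s' e|U_noise|s 0>, as in (4.5)."""
    u = u_noise.reshape(dim_s, dim_e, dim_s, dim_e)   # indices: s', e, s, e_in
    return [u[:, e, :, 0] for e in range(dim_e)]

rng = np.random.default_rng(1)
dim_s, dim_e = 2, 3
u_noise = random_unitary(dim_s * dim_e, rng)
kraus = kraus_operators(u_noise, dim_s, dim_e)

# Completeness relation (4.6).
completeness = sum(m.conj().T @ m for m in kraus)

# Decomposition (4.4) on an arbitrary system state with the environment in |0>.
psi = np.array([0.6, 0.8j])
env = np.eye(dim_e)
lhs = u_noise @ np.kron(psi, env[0])
rhs = sum(np.kron(m @ psi, env[e]) for e, m in enumerate(kraus))

print(np.allclose(completeness, np.eye(dim_s)), np.allclose(lhs, rhs))
```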

The Kraus operators can be awkward to work with in practice because their definition
depends on the details of the interaction with the environment, which we usually
do not know. It is therefore convenient to expand them in a standard basis $E_\alpha$ of
hermitian operators such as the $2^{2n}$ distinct products of the Pauli operators

$$M_e = \sum_\alpha C_{\alpha e}E_\alpha , \qquad (4.7)$$

and rewrite

$$U_{\rm noise}|s\rangle|0\rangle = \sum_\alpha E_\alpha|s\rangle|\alpha\rangle . \qquad (4.8)$$

Here we have defined $|\alpha\rangle = \sum_e C_{\alpha e}|e\rangle$, which are no longer necessarily orthonormal.

In this language there is a necessary and sufficient condition for when exact error 
correction is possible for a given code [49, 50]. A set $\mathcal{E}$ of errors $E_\alpha$ is exactly correctable 
if and only if 

$$\langle c'|E_\alpha E_\beta|c\rangle = \delta_{cc'} C_{\alpha\beta} \tag{4.9}$$

for any $|c\rangle$, $|c'\rangle$ in some orthonormal basis for the code space, for any $E_\alpha, E_\beta \in \mathcal{E}$, 
and with the coefficients $C_{\alpha\beta}$ independent of $c$. If the Kraus operators produced by the 
interaction with the environment can be written only in terms of $E_\alpha$'s in a set $\mathcal{E}$ with 
this property, then a generalization of the procedure described above for the Shor code 
will always allow the state to be recovered perfectly. 
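The criterion (4.9) is easy to test numerically on small examples. The sketch below is illustrative only: it assumes a two-qubit code ($n = 2$, $k = 1$) whose code states are $|c\rangle|0\rangle$, and the function name `satisfies_4_9` is hypothetical.

```python
import itertools
import numpy as np

I2 = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])

def satisfies_4_9(errors, code_basis, tol=1e-12):
    """Check <c'|E_a E_b|c> = delta_{cc'} C_ab for every pair of errors.

    For products of Pauli matrices a dagger on E_a would change C_ab by at
    most a sign, so it does not affect whether the condition holds.
    """
    for Ea, Eb in itertools.product(errors, repeat=2):
        G = np.array([[cp.conj() @ Ea @ Eb @ c for c in code_basis]
                      for cp in code_basis])
        # G must be proportional to the identity on the code space.
        if not np.allclose(G, G[0, 0] * np.eye(len(code_basis)), atol=tol):
            return False
    return True

# Code states |c>|0> with c = 0, 1.
code_basis = [np.kron(v, [1., 0.]) for v in np.eye(2)]
on_second = [np.kron(I2, P) for P in (I2, X, Z, X @ Z)]  # act only on qubit 2
with_first = on_second + [np.kron(X, I2)]                # add a flip of qubit 1
```

In this example `satisfies_4_9(on_second, code_basis)` holds, while adding a flip of the information-carrying qubit breaks the condition.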

To build some intuition for this criterion we now consider the set of correctable 
errors for the trivial code, where $U_{enc} = 1$. The codespace is spanned by states of the 
form 

$$|c\rangle = |c_1 \ldots c_k 0 \ldots 0\rangle. \tag{4.10}$$
As suggested above we will take the $E_\alpha$'s to be the $2^{2n}$ distinct products of Pauli 
matrices, which in general we can write as 

$$X_1^{\alpha_1} Z_1^{\beta_1} \cdots X_n^{\alpha_n} Z_n^{\beta_n}. \tag{4.11}$$

The parameters $\alpha_1, \beta_1, \ldots$ take the values 0 or 1. It is easy to see that the largest subset 
$\mathcal{E}$ of these operators that satisfies (4.9) is just the set of all such operators for which the 
first $k$ of the $\alpha$'s and $\beta$'s are zero, or in other words the subset which acts trivially on the 
first $k$ qubits. That such errors are correctable is completely unsurprising: they don't 
affect the information carrying bits at all! To perform the error correction procedure 
analogous to what we did for the Shor code, we need to find a set of "check" operators 
which have eigenvalue one on states in the code space; the obvious choice here is just 
the $n - k$ Pauli $Z$ operators acting on the last $n - k$ qubits. By measuring these and 
then flipping any which come out $-1$ using Pauli $X$ operators, we can clearly repair 
any state of the form 

$$\sum_\alpha E_\alpha|c\rangle|\alpha\rangle \tag{4.12}$$

as long as all $E_\alpha$'s that appear are in $\mathcal{E}$. 22 
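The measure-and-flip procedure for the trivial code can be sketched on a small state vector. The example below assumes $n = 3$, $k = 1$ and a single Pauli error rather than a superposition of errors, so each check measurement has a definite outcome; the helper `on_qubit` is illustrative.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])

def on_qubit(P, q, n):
    """Embed the single-qubit operator P on qubit q (0-indexed) of n qubits."""
    out = np.array([[1.]])
    for j in range(n):
        out = np.kron(out, P if j == q else I2)
    return out

n, k = 3, 1
psi = np.zeros(2 ** n)
psi[0b000], psi[0b100] = 0.6, 0.8  # a|0>|00> + b|1>|00>, an arbitrary code state

# A correctable error: it acts only on the last n - k qubits.
error = on_qubit(X, 1, n) @ on_qubit(X, 2, n) @ on_qubit(Z, 2, n)
damaged = error @ psi

repaired = damaged.copy()
for q in range(k, n):
    check = on_qubit(Z, q, n)
    # After a single Pauli error the state is an eigenstate of each check
    # operator, so the measurement outcome equals the expectation value.
    outcome = round(float(repaired @ check @ repaired))
    if outcome == -1:
        repaired = on_qubit(X, q, n) @ repaired

overlap = float(psi @ repaired)  # +1 or -1: recovery up to an overall phase
```

The superposition coefficients 0.6 and 0.8 survive the repair, illustrating that coherence within the code space is preserved.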

This construction is easily extended to the general case of nontrivial $U_{enc}$. We can 
simply take the set $\mathcal{E}$ of correctable errors for the trivial code and conjugate it by $U_{enc}$ 
to define $\mathcal{E}' = U_{enc}\mathcal{E}U_{enc}^\dagger$. The check operators can be taken to be $U_{enc}Z_iU_{enc}^\dagger$ for the 
last $n - k$ Pauli spin $Z$ operators, and the repairs can be done using the last $n - k$ 
$U_{enc}X_iU_{enc}^\dagger$'s. This protocol is not unique since for a given code subspace there will be 
many $U_{enc}$'s which satisfy (4.3). 

4.2 Computational Complexity, Stabilizer Codes, and NP-Hardness 

In this section we discuss the computational complexity of the error correction procedure 
described in the previous section. An obvious source of computational hardness 
is that in measuring the check operators and repairing any that get flipped, we need to 
repeatedly use the encoding transformation $U_{enc}$. If this unitary is difficult to implement 
as a quantum circuit, the error correction procedure will be very time-consuming. 
In fact in the situation where the only errors generated by the environment are in the 
correctable set $\mathcal{E}$, this is the only source of hardness since the rest of the correction 
procedure clearly only takes linear time in $n$. 

The situation is more complicated when the environment can produce errors that 
are not in the set $\mathcal{E}$, such as flipping any two spins in the Shor code or flipping the 
first spin in the trivial code. In this more realistic case exact error correction is not 
possible, and a would-be quantum repair-person has to look at some model of the 
environment dynamics, extract as much information as she can from the check operators, 
and make her best guess about what repairs to apply. This procedure is sometimes 
called "maximal likelihood decoding", or MLDC for short. It can be a new source of 
computational complexity in doing error correction. In more detail, out of the $2^{2n}$ total 
possible errors $E_\alpha$ there will be a subset $\mathcal{E}_{bad}$ of size $2^{2k}$ which commute with all of 
the check operators and act non-trivially on the code space. In the trivial code these 
"bad" errors are the different combinations of $X$ and $Z$ operators acting on the first $k$ 
qubits. For a general code we can take $\mathcal{E}_{bad}$ to be the image of its trivial code analogue 
under conjugation by $U_{enc}$. After the error correction described in the previous section 
is complete, the state will still be of the form (4.12) but with all $\alpha$'s in the sum now 
elements of $\mathcal{E}_{bad}$. Our repair-person now needs to pick the "most likely" of these $2^{2k}$ 
errors, say it is called $E_{max}$, act on the state with $E_{max}$ again to correct it, and hope 
for the best. 23 Determining which of the $2^{2k}$ is the most likely however can require a 
significant amount of classical computation when $k$ is large. In general it might require 
a search through $2^{2k}$ possibilities, which rapidly becomes computationally prohibitive. 24 
As we now sketch, even for a particularly simple set of error correcting codes and a 
reasonably simple model of the noise this process is indeed known to be NP-complete 
[52, 54], which is the gold standard in evidence that a classical computation requires 
exponential time. Good reviews of the basic properties of NP-completeness and its 
possible relevance for physics are [20, 55]. It will be important later that this additional 
source of computational difficulty is relevant only if $k$ is large. 

22 As we saw for the Shor code, the error correction procedure only returns the original state up to 
a phase. The phase is independent of $c$ however, so coherence of superpositions in the code space is 
preserved. 

The most widely studied quantum error correcting codes are the stabilizer codes 
[56], which are defined by the property that the $n - k$ check operators are all just 
products of Pauli operators of the form (4.11). Clearly both the Shor code and the 
trivial code are stabilizer codes. The reason for the name is that the check operators 
generate a $2^{n-k}$-element abelian subgroup $S$ of the full group $\mathcal{G}$ of $2^{2n}$ products of Pauli 
operators, with the property that with respect to the action of $\mathcal{G}$ on the $n$-qubit Hilbert 
space $S$ is the stabilizer subgroup of the $k$-qubit code subspace. Thus rather than giving 
the codespace explicitly we can instead define it by picking some abelian subgroup of 
the Pauli group; in practice this is a much more convenient way of defining an error 
correcting code. Stabilizer codes are fairly easy to encode: for any stabilizer code there 
is a relatively simple construction [56] of a circuit of size $O(n^2)$ which implements $U_{enc}$ 
exactly. 

23 Actually this strategy is not quite optimal [51]. Really she wants to find the most likely set of 
errors which can be repaired by a single repair operator. This is taken into account in the result [52] 
we discuss below, but the main result does not change. 

24 If we use a quantum computer we can use Grover's algorithm [53] to search through $2^{2k}$ things in a 
time $2^k$, which is faster but still exponential. Also, one could imagine trying to "precompute" the most 
likely error for each particular set of outcomes for the check operator measurements, but the result 
of the precomputation would clearly be exponentially big since there are $2^{n-k}$ possible measurement 
results. Carrying it around would be prohibitively difficult. 

To estimate the hardness of maximal likelihood decoding for stabilizer codes we 
need some model of noise. Such a model is usually called a quantum channel. For 
any error $E_\alpha \in \mathcal{G}$ we can define its weight $w_\alpha$ as the number of $\alpha_i$'s and $\beta_i$'s in the 
parametrization (4.11) which are nonzero. A simple model of error probability is then 
that the probability of any $E_\alpha$ occurring is $p^{w_\alpha}$. Roughly this says that each $X$ or $Z$ error 
occurs with probability $p$, while $Y$ errors occur with probability $p^2$. 25 It is not hard to 
write down a unitary transformation between a system bit and an environment that 
realizes this explicitly. Roughly, to do MLDC we then need to measure the check operators, 
find the error of lowest weight consistent with the results of these measurements, 
and then correct it. As mentioned in a previous footnote this strategy is not quite 
optimal; really we should find the most likely set of errors which can be corrected by a 
single repair operator and then apply it. 
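The weight and the associated probability factor are simple to compute from the exponents in (4.11). The sketch below is illustrative; the function names and the bit-tuple representation of an error are assumptions, and the factor $p^{w}$ is used as stated in the text without normalization.

```python
import itertools
from collections import Counter

def weight(alpha, beta):
    """Number of nonzero alpha_i's and beta_i's in the parametrization (4.11)."""
    return sum(alpha) + sum(beta)

def probability_factor(alpha, beta, p):
    """The factor p**w assigned to an error of weight w in the simple noise model."""
    return p ** weight(alpha, beta)

def weight_distribution(n):
    """Count the 2**(2n) Pauli products on n qubits by weight."""
    counts = Counter()
    for bits in itertools.product((0, 1), repeat=2 * n):
        alpha, beta = bits[:n], bits[n:]
        counts[weight(alpha, beta)] += 1
    return dict(counts)
```

For instance a single $Y$ error has $\alpha_i = \beta_i = 1$ on one qubit and hence weight 2, which is the asymmetry discussed in footnote 25.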

As explained in [52] either of these strategies can embed the following question 
about matrices: 

• Say we are given an integer $w > 0$, an $(n - k) \times n$ matrix $H$, and a vector $y$ with 
$n - k$ components, with the components of the latter two being elements of the 
finite field $\mathbb{Z}_2$. Is there a vector $e$ with $n$ components in $\mathbb{Z}_2$, at most $w$ of them 
nonzero, with the property that $He = y$? 

This problem is called the Coset Weights problem; it appears in classical coding theory 
and was long ago shown to be NP-complete [54]. More precisely Hsieh and Le Gall 
showed that the ability to do MLDC of stabilizer codes in this quantum 
channel in polynomial time in $k$ would allow polynomial in $k$ time solutions of Coset 
Weights as well. Since Coset Weights is NP-complete, this then would allow polynomial 
time solution of any size $k$ problem in the computational class NP. This is still not 
known to be impossible, but the very widely believed conjecture $P \neq NP$ precludes it. 
For a discussion of the vast support for this conjecture, see for example [57]. 
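To make the Coset Weights question concrete, here is a brute-force sketch that searches candidate vectors in order of increasing weight. It assumes the standard form of the problem in which $e$ may have at most $w$ nonzero entries; the function name and the example matrix (the parity-check matrix of the classical [7,4] Hamming code) are chosen only for illustration.

```python
import itertools
import numpy as np

def coset_weights(H, y, w):
    """Search for e over Z_2 with H e = y and at most w nonzero entries.

    Returns a solution of lowest weight, or None if none exists within
    weight w. The number of candidates visited grows combinatorially with
    n and w, which is why this is only feasible for tiny inputs.
    """
    H = np.asarray(H) % 2
    y = np.asarray(y) % 2
    n = H.shape[1]
    for weight in range(w + 1):
        for support in itertools.combinations(range(n), weight):
            e = np.array([1 if i in support else 0 for i in range(n)])
            if np.array_equal((H @ e) % 2, y):
                return e
    return None

# Illustrative input: column i is the binary expansion of i + 1.
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
```

No polynomial time algorithm is known that avoids this kind of exhaustive search in general, in line with the NP-completeness result quoted above.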

25 That $Y$ is special is an undesirable asymmetry that is necessary for the proof of [52]. This seems 
to be a technical problem, and we expect that MLDC of stabilizer codes is still NP-complete for the 
"depolarizing channel", where the probability for a $Y$ occurring is also $p$. We have not proven this, but 
it seems unlikely that making $Y$ errors more likely would make the decoding easier. 
4.3 Alice's Task as Error Correction 

Having reviewed all this formalism we now see what it has to tell us about Alice's 
computing task. Originally we had hoped that the results of [52] would allow us to 
demonstrate the NP-completeness of Alice's decoding job, but this turns out not to 
work. Nonetheless we are still able to use the formalism of error correction to make a 
few interesting comments about the practical implementation of the AMPS experiment. 

It is easy to recast what Alice is trying to do as a quantum coding problem [2]. We 
can write the state (3.3) of the black hole as 

$$|\Psi\rangle = \frac{1}{\sqrt{|B|}}\sum_b |b\rangle_B |\bar{b}\rangle, \tag{4.13}$$

where 

$$|\bar{b}\rangle = \frac{1}{\sqrt{|H|}}\sum_h |h\rangle_H U_R|bh0\rangle_R \tag{4.14}$$

is a basis for a k dimensional subspace of $\mathcal{H}_H \otimes \mathcal{H}_R$. We can obviously interpret this 
subspace as a quantum code, with encoding transformation 

$$U_{enc} = U_R U_{mix,H}. \tag{4.15}$$

Here $U_{mix,H}$ is a simple entangling transformation analogous to $U_{mix}$ from equation 
(3.24), now entangling only the $m$ qubits of $H$ and the $n + 1$ to $n + m$th qubits of 
$R$. Thus for this quantum code the complexity of implementing $U_{enc}$ is essentially 
equivalent to implementing $U_R$. Exact error correction thus gives no new argument 
for the difficulty of implementing $U_R$. But are all the errors exactly correctable? The 
obvious source of error here is that Alice does not have access to the horizon degrees 
of freedom $H$, so it seems she must do some error correction. Consider two errors $E_1$ 
and $E_2$ on $\mathcal{H}_H \otimes \mathcal{H}_R$ which act nontrivially only on $\mathcal{H}_H$. From (4.14) we can compute 

$$\langle\bar{b}'|E_1 E_2|\bar{b}\rangle = \frac{1}{|H|}\delta_{bb'}\mathrm{Tr}_H(E_1 E_2), \tag{4.16}$$

which obeys the necessary and sufficient condition (4.9) for exact quantum error 
correction. This is unsurprising, of course, since we already know that Alice can do the 
decoding exactly just by applying $U_R^\dagger$ to $\mathcal{H}_R$. It shows clearly however that correcting 
errors having to do with not having access to $H$ does not require any additional 
computation of maximum likelihoods. 
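The identity (4.16) can be verified numerically for small systems. The sketch below is only a consistency check under stated assumptions: one qubit each for $B$ and $H$, three qubits for $R$ so that $|bh0\rangle_R$ has one spare qubit, and a Haar-like random unitary standing in for $U_R$.

```python
import numpy as np

rng = np.random.default_rng(1)
dB, dH, dR = 2, 2, 8  # illustrative sizes; R holds |b h 0> with one spare qubit

A = rng.normal(size=(dR, dR)) + 1j * rng.normal(size=(dR, dR))
U_R, _ = np.linalg.qr(A)  # stand-in for the scrambling unitary U_R

def code_state(b):
    """|b-bar> = |H|^{-1/2} sum_h |h>_H U_R |b h 0>_R, following (4.14)."""
    state = np.zeros(dH * dR, dtype=complex)
    for h in range(dH):
        ket_h = np.eye(dH)[h]
        ket_bh0 = np.zeros(dR)
        ket_bh0[(b << 2) | (h << 1)] = 1.0
        state += np.kron(ket_h, U_R @ ket_bh0)
    return state / np.sqrt(dH)

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [np.eye(2, dtype=complex), X, Z, X @ Z]

def gram(E1, E2):
    """Matrix of <b'-bar| E1 E2 |b-bar> for errors acting on H only."""
    op = np.kron(E1 @ E2, np.eye(dR))
    states = [code_state(b) for b in range(dB)]
    return np.array([[sp.conj() @ op @ s for s in states] for sp in states])
```

For every pair of Pauli errors on $H$ the matrix `gram(E1, E2)` comes out proportional to the identity with coefficient $\mathrm{Tr}_H(E_1 E_2)/|H|$, independent of the choice of $U_R$.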

There are of course other errors that can occur, having to do with difficulty of con- 
trolling the Hawking radiation. In particular an order one fraction of the radiation will 
be gravitons, which are very hard to detect at all, never mind coherently manipulate. 
We would be uncomfortable trying to use such "gritty" limitations to resolve such an 
important question as whether or not there are firewalls, but in any case we will argue 
here that even if we wished to do so the argument would not work. The reason is that 
to test the AMPS entanglement, it is already sufficient to only look at a few modes. 
This means that we can take $k$ to be order one and thus that, as explained in the 
previous section, MLDC adds no significant challenge to testing the AMPS entanglement. 
Moreover it was shown long ago by Page [58] that the rate of Hawking radiation into 
a field of spin $s$ decreases as $s$ increases, so for Schwarzschild black holes gravitons will 
only make up some small order one fraction of the radiation. The question of how 
many bits can be lost without losing the ability to error correct with high probability 
of success has been studied in the literature, and the answer is that generically one 
can lose nearly half of the bits and still be able to repair the state with high fidelity. 26 For 
example in the erasure channel, which erases a known set of $pn$ qubits with $0 < p < 1$, 
a typical stabilizer code with $\frac{k}{n} < 1 - 2p$ is correctable with high probability [35, 56]. 
This means that Alice can lose all of the gravitons and still be able to test the 
entanglement accurately with room to spare for correcting additional errors! Of course, since 
we do not expect $U_R$ to have a polynomial size circuit, we do not think the code (4.14) 
will actually be a stabilizer code, but more generic codes should protect information 
against local errors like losing gravitons at least as well as stabilizer codes do. 
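The erasure condition quoted above is a one-line inequality, and the arithmetic can be spelled out as follows. This is a sketch with hypothetical function names; the numbers used in the usage note are illustrative and are not taken from the paper.

```python
def max_erasure_fraction(k, n):
    """Supremum of p satisfying k/n < 1 - 2p, namely (1 - k/n) / 2."""
    return (1 - k / n) / 2

def erasure_correctable(k, n, p):
    """True if the quoted condition k/n < 1 - 2p holds for erasure fraction p."""
    return k / n < 1 - 2 * p
```

For example, with $k = 1$ and $n = 1000$ the tolerable erased fraction approaches 0.4995, so losing a modest order one fraction of the qubits still leaves the condition satisfied.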

4.4 Error Correction and Zero-Knowledge Proofs 

We've seen that the known NP-hardness results about decoding quantum error correcting 
codes unfortunately can't be invoked directly to draw any firm conclusions about 
$U_R$, but the difficulty of implementing $U_R$ is actually closely related to another 
complexity class known as Quantum Statistical Zero-Knowledge (QSZK). The idea of a 
zero-knowledge proof is best explained by example. 

Consider the problem of determining whether two graphs $G_1 = (V_1, E_1)$ and $G_2 = 
(V_2, E_2)$ are isomorphic, that is, whether there exists a permutation $\pi$ of the vertices 
of $G_2$ turning $G_2$ into $G_1$. There is currently no polynomial time classical or quantum 
algorithm known for the graph isomorphism problem. Suppose, however, that some 
inventive computer scientist claimed to be able to solve the graph isomorphism problem but 
jealously guarded his secret abilities. Would he be able to convince you that 
$G_1$ and $G_2$ are isomorphic without revealing any information about the isomorphism? Yes! 
The computer scientist begins by randomly permuting the vertices of $G_1$, sending you 
the resulting graph $G_3$. At that point, you flip a coin and, depending on the outcome, 
challenge him to exhibit an isomorphism to either $G_1$ or $G_2$. He will be able to succeed 
if the graphs really were isomorphic but will necessarily fail half the time otherwise. 
After a few repetitions of the process, you will become convinced of the existence of the 
isomorphism between $G_1$ and $G_2$ without learning anything at all about its structure. 

26 There is a simple argument that one cannot lose more than half of the bits. We could just take 
the state and then give two people one half of the bits each. If they could both restore the initial state 
with high probability from their bits alone then this would be a quantum cloning machine, which is 
impossible. Also we comment that $n$ here is the size of the code subspace, so in our black hole model 
it is actually $n + m$. 
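The graph isomorphism protocol described above can be simulated in a few lines. The sketch below covers only the honest prover, and it represents a graph as a set of undirected edges; the function names and the example graph are assumptions made for illustration.

```python
import random

def permute(graph, perm):
    """Relabel the vertices of an edge set: vertex v becomes perm[v]."""
    return frozenset(frozenset((perm[u], perm[v])) for u, v in graph)

def compose(p, q):
    """The permutation 'first q, then p'."""
    return [p[q[v]] for v in range(len(q))]

def run_round(G1, G2, pi, rng):
    """One round of the protocol. pi is the prover's secret: permute(G2, pi) == G1."""
    sigma = list(range(len(pi)))
    rng.shuffle(sigma)
    G3 = permute(G1, sigma)              # prover sends a scrambled copy of G1
    challenge = rng.choice((1, 2))       # verifier's coin flip
    if challenge == 1:
        response, target = sigma, G1     # isomorphism G1 -> G3
    else:
        response, target = compose(sigma, pi), G2  # isomorphism G2 -> G1 -> G3
    return permute(target, response) == G3         # verifier's check

G2 = [(0, 1), (1, 2), (2, 3)]  # a path on four vertices (illustrative)
pi = [2, 0, 3, 1]              # the prover's secret relabeling
G1 = permute(G2, pi)
rng = random.Random(0)
all_pass = all(run_round(G1, G2, pi, rng) for _ in range(20))
```

In each round the verifier sees only a uniformly random relabeling, either of $G_1$ or of $G_2$, which is why nothing about $\pi$ itself leaks.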

There are different ways to formalize the idea of a zero-knowledge proof, leading to 
potentially different complexity classes. The version relevant here is known as the class 
Statistical Zero-Knowledge (SZK) [59]. In the quantum mechanical analogue, the two 
participants, usually known as the prover and the verifier, would exchange qubits rather 
than bits, with the resulting class called QSZK [60]. There is absolutely no constraint 
on the computational power of the prover, but the verifier can only perform polynomial 
time quantum computations. Moreover, only a statistically negligible amount of 
information should leak from the prover to the verifier. QSZK is the set of computational 
problems with yes/no answers for which such a prover can always convince the verifier 
of yes instances but will fail with high probability for no instances. It is known that the 
quantum model is at least as powerful as the classical one: $SZK \subseteq QSZK$ [61]. QSZK 
also trivially contains BQP, the class of problems that can be solved in polynomial time 
on a quantum computer: for such problems, the verification can be done directly using the 
computer itself without any need for a clever discussion with a prover. QSZK should therefore 
be understood as the set of problems whose yes instances can be reliably identified 
using a quantum computer with the help of an all-powerful prover, albeit one who is 
both secretive and dishonest. To assert that all the problems in QSZK can be solved in 
quantum polynomial time, that QSZK = BQP, is to assert that the prover is ultimately 
no help at all. 

Given some arbitrary polynomial-sized quantum circuit $U_{noise}$ acting on three systems 
$B$, $H$ and $R$ such that $|\psi\rangle_{BHR} = U_{noise}|000\rangle_{BHR}$ with $|\psi\rangle_{BHR}$ maximally entangled 
between $B$ and $HR$, determining whether maximal entanglement with $B$ can be decoded 
from $R$ is a well-defined computational problem. Call it the Error Correctability 
problem. Note that Error Correctability has a quantum statistical zero-knowledge proof, 
which simply consists of having the prover implement the quantum error correction 
operation on $R$ (not caring that it might take exponential time) and having the verifier 
check the result. In fact, the problem of determining whether noise is correctable in this 
sense is complete for QSZK, meaning that any other problem in QSZK can be efficiently 
mapped onto a version of the error correction problem [62]. 

Suppose now that, given a circuit for an arbitrary correctable $U_{noise}$, it were possible 
to efficiently find and implement the error correction procedure. In that case, the entire 
zero-knowledge proof for Error Correctability described above could be implemented 
efficiently on a quantum computer. In the case of yes instances, the procedure would 
produce verifiable maximal entanglement with $B$. In the case of no instances, no 
such entanglement could be produced regardless of the decoding procedure attempted. 
Moreover, since Error Correctability is QSZK-complete, that means that every problem 
in QSZK could be solved on a quantum computer: being able to efficiently decode noise 
whenever it is correctable would imply that QSZK = BQP. (It is important to remember, 
however, that simply determining whether some errors are correctable could be much 
easier than actually correcting them.) 

To make contact with decoding Hawking radiation, the polynomial-sized quantum 
circuit $U_{noise}$ should of course be $U_{dyn}$. Crucially, Error Correctability remains 
QSZK-complete even if $B$ consists of only a single qubit, unlike the NP-hardness result for 
stabilizer decoding discussed earlier. The question at the core of this article is whether 
the decoding can be performed in time polynomial in the size of the circuit for $U_{dyn}$. We 
have seen that being able to do so would imply that QSZK = BQP if $U_{dyn}$ represented 
arbitrary correctable noise. 

Since the real $U_{dyn}$ describing black hole evaporation is very special, we could ask 
whether its known properties are so unusual as to undermine the argument. Specifically, 
for sufficiently late times, we expect maximal entanglement between $BH$ and $R$. It 
is possible to a certain extent to achieve the same thing using arbitrary $|\psi\rangle_{BHR} = 
U_{noise}|000\rangle_{BHR}$ by simply working with $k$ copies of $|\psi\rangle$. The resulting state $|\psi\rangle^{\otimes k}$ rapidly 
converges to one with near-maximal entropy concentrated in the "typical subspace" of 
$B^{\otimes k}H^{\otimes k}$ [63]. This property is similar to true maximal entanglement generated by 
$U_{dyn}$, albeit slightly weaker. 

The conviction that $P \neq NP$ has developed over several decades of research in 
algorithm design and complexity theory. The belief that QSZK-complete problems 
cannot be solved efficiently on a quantum computer is admittedly less well-founded but 
does have some algorithmic and complexity theoretic support. 

Researchers have been working for twenty years on the design of efficient quantum 
algorithms and some problems have stubbornly resisted attack. In particular, Shor's 
factoring algorithm naturally extends to an efficient quantum algorithm for the more 
general Abelian Hidden Subgroup problem [64]. Researchers have been trying consistently 
since then to attack the non-Abelian version of the problem but with only very 
limited success [65-67]. (Note that the non-Abelian version includes the graph isomorphism 
problem discussed above as a special case [68].) Large classes of strategies based 
on Shor's Fourier-sampling approach are known to fail [69, 70]. 

Given all the fruitless effort that has gone into trying to find an efficient quantum 
algorithm for solving the non-Abelian Hidden Subgroup problem, researchers have begun 
to suspect that no such algorithm exists. Moore, Russell and Vazirani took one step 
further and defined a classical invertible function that is efficient to evaluate but hard to 
invert on a quantum computer under the assumption that there is no efficient quantum 
algorithm for non-Abelian Hidden Subgroup [71]. Their construction is easily adapted to 
rule out efficient quantum algorithms for decoding efficiently encoded quantum error 
correcting codes under the same assumption. Structurally, the function is parametrized 
by a list of $m$ vectors $V$ over $\mathbb{F}_q^n$. The function $f_V$ takes a matrix $M \in GL_n(\mathbb{F}_q)$ to $MV$, 
with the output returned as an unordered list. ($m$ is selected to be only slightly larger 
than $n$, which is sufficient to ensure that the function is injective with high probability.) 

Instead of returning an unordered list, however, the output vectors could equivalently 
be ordered but permuted by an unknown permutation $\pi$, which can be used to 
define the following isometry: 

$$U : |M\rangle_A \mapsto \frac{1}{\sqrt{m!}}\sum_\pi |\pi\rangle_H |\pi(MV)\rangle_R. \tag{4.17}$$

Then $(I_B \otimes U)$ acting on a state maximally entangled between $A$ and $B$ efficiently 
generates a state maximally mixed on $BH$ with the property that the purification of $B$ 
can be recovered by a unitary acting on $R$ alone, precisely mimicking the key properties 
of $\sum_{b,h}|b\rangle_B|h\rangle_H U_R|bh0\rangle_R$. If $f_V$ is hard to invert on a quantum computer, however, 
the decoding unitary can't be implemented in polynomial time. And under the assumption 
that there is no efficient algorithm for non-Abelian Hidden Subgroup, $f_V$ is indeed hard 
to invert, even for $V$ chosen uniformly at random. 
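Evaluating $f_V$ is straightforward, which is the "easy" direction of the construction. The sketch below assumes $q$ prime so that arithmetic modulo $q$ gives the field $\mathbb{F}_q$, represents the unordered list canonically by sorting, and uses small illustrative values of $q$, $M$ and $V$ that are not taken from [71].

```python
def mat_vec(M, v, q):
    """Multiply the n x n matrix M by the vector v over F_q (q prime)."""
    return tuple(sum(a * b for a, b in zip(row, v)) % q for row in M)

def f_V(M, V, q):
    """Return M V as an unordered list, represented canonically by sorting."""
    return tuple(sorted(mat_vec(M, v, q) for v in V))

def permuted_output(M, V, perm, q):
    """The ordered-but-permuted variant pi(MV) that enters the isometry (4.17)."""
    images = [mat_vec(M, v, q) for v in V]
    return tuple(images[perm[i]] for i in range(len(V)))

q = 5
V = [(1, 0), (0, 1), (1, 1)]   # m = 3 vectors in F_5^2
M = [[2, 1], [0, 3]]           # determinant 6 = 1 mod 5, so M is invertible
```

Sorting the permuted output always reproduces `f_V(M, V, q)`, reflecting the equivalence between the unordered list and the list scrambled by an unknown permutation. Inverting the map, that is recovering $M$ from the unordered output, is the step conjectured to be hard.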

At the level of complexity theory, there is some evidence that QSZK-complete problems 
cannot be solved using small quantum circuits. (An efficient quantum algorithm 
corresponds to a small circuit that can itself be laid out efficiently, a further requirement 
that should arguably be relaxed in discussions of the AMPS paradox.) It is known that 
determining whether a function is 1-to-1 or 2-to-1 requires exponentially many calls to 
the function, even for a quantum computer [72]. It is easy to construct a statistical 
zero-knowledge protocol for the problem, however, in the setting in which the prover 
knows the function and the verifier is making queries to try and distinguish the 1-to-1 
and 2-to-1 cases. Since $SZK \subseteq QSZK$, the protocol lifts to a quantum statistical 
zero-knowledge proof as well. In this model, therefore, an exponentially large number of 
queries is required to solve the problem using a quantum computer even though it has 
a zero-knowledge protocol. Any demonstration that QSZK has small circuits would 
somehow have to be reconciled with that fact, which essentially rules out any strategy 
which just transforms a zero-knowledge protocol directly into a small circuit. 
Instead, the demonstration would need to make essential use of some subtle internal 
structure of the problems contained in QSZK. 27 

27 We thank John Watrous for suggesting this argument. 
So the conclusion of this section is that to the extent the AMPS experiment is 
difficult, it is because it is hard to implement $U_R$ using a quantum circuit. The practical 
difficulties of doing the experiment, and in particular the problems associated with 
measuring gravitons, only increase the difficulty further. It is still interesting to note 
that even if $U_R$ were easy to implement, if Alice for some reason were interested in 
testing the entanglement for a large number $k$ of the modes then the NP-completeness 
of MLDC discussed in the previous section would come into play. Even for fixed $k$, 
however, the decoding problem is almost certainly at least QSZK-hard and there are 
strong reasons to believe that such problems can't be solved in polynomial time on a 
quantum computer. From the computer science point of view, it would be extremely 
surprising if implementing $U_R$ did not require exponential time. 

5 More General Black Holes 

For a Schwarzschild black hole the evaporation time scales like the entropy to the 3/2 
power, which is clearly much too fast for Alice to complete a computational task that 
requires time that is exponential in the entropy. In this section we consider the AMPS 
experiment for some more general classes of black holes. 
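The mismatch between the two time scales is easiest to appreciate on a logarithmic scale. The sketch below works in Planck units and drops order one factors; the entropy value $S \approx 10^{77}$, roughly that of a solar-mass black hole, is an illustrative input that does not appear in the text.

```python
import math

def log10_evaporation_time(S):
    """log10 of S**1.5, the Schwarzschild evaporation time up to O(1) factors."""
    return 1.5 * math.log10(S)

def log10_decoding_time(S):
    """log10 of e**S, the claimed scale of the decoding time."""
    return S / math.log(10)

S = 1e77  # illustrative entropy, roughly that of a solar-mass black hole
evaporation = log10_evaporation_time(S)
decoding = log10_decoding_time(S)
```

Here `evaporation` is about 115.5 while `decoding` is of order $4 \times 10^{76}$, so the exponents themselves differ by some 74 orders of magnitude.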

5.1 Schwarzschild in a Box 

In order for Alice to have any chance of doing the AMPS experiment given our claim 
of exponential decoding time, she will clearly need some way of slowing down the 
evaporation of the black hole. The simplest thing she could imagine is letting the black 
hole become old and then putting it inside of some sort of reflecting box to prevent it 
from evaporating. 

Closed finite entropy systems however behave very strangely over times of order $e^S$. 
For example over that kind of timescale a gas of particles in a room sometimes finds 
itself collected up in the corner of the room, and other times finds itself spontaneously 
assembling into a puddle of liquid on the floor. This is the phenomenon of Poincaré 
recurrence [19]. Indeed $e^S$ is sometimes called the "classical recurrence time", to be 
contrasted with the "quantum recurrence time" $e^{e^S}$ which we encountered in section 
3.1. This nomenclature is a little misleading; after all, quantum mechanics is the reason 
that the entropy of the gas is finite in the first place, but what it really means is the 
following: $e^S$ is the time scale over which some finite entropy quantum mechanical 
system will be "classically close" to an order one fraction of some orthogonal basis for 
its Hilbert space, up to conservation laws. In other words the probability that repeated 
measurements in that basis over a time of order $e^S$ at some point give any particular 
result allowed by conservation laws is order one. The significantly longer quantum 
recurrence time, by contrast, is the time it takes for the system to get close in the trace 
norm to any particular quantum state. The basic distinction here is that the number 
of elements in a basis of the Hilbert space is $e^S$, while the number of elements in an 
$\epsilon$-net of the type discussed in section 3.1 is $e^{e^S}$. 

This means that it will be very difficult to confine the black hole in a box for such 
a long period of time. For example there is an order one probability that the black 
hole will produce a gigantic nuclear warhead and fire it at the side of the box. Or 
another black hole. The black hole will also itself crash into the box every now and 
then. Making the box big to try to avoid such things is not allowed because then the 
black hole will just evaporate into a diffuse gas of particles inside the box and there 
will be no black hole left when Alice finishes her computation and opens the box. 

Of course every string theorist knows how to avoid the problems with putting a 
black hole in a box in Minkowski space: we just put it in Anti de Sitter space! As long 
as the black hole is large enough the reflecting boundary of AdS space will feed its own 
radiation back into it fast enough to prevent it from evaporating. Of course to make 
the AMPS argument at all we need the black hole to become maximally entangled 
with some external system, so following a suggestion of Don Marolf we imagine this is 
done by mining the black hole down to less than half of its initial entropy. 28 But this 
setup brings with it a new problem; we now need to put Alice, her mining equipment, 
and her computer in the box as well! Since the calculation still takes of order the 
recurrence time, this means that both Alice and her assorted paraphernalia now need 
to be resistant to nuclear warheads/mini black holes/etc. Alice could try to avoid this by 
staying very far away from the black hole, namely exponentially near the boundary, so 
that from her point of view the recurrences become effectively low energy enough not to 
affect her. In doing so however she will be fighting against an effective potential pulling 
her back to the center. To do this for enough time to accomplish the computation would 
require exponentially large amounts of rocket fuel, and the exhaust from burning all 
that fuel would fall back into the black hole anyway and pollute her experiment. Trying 
to use angular momentum to stabilize her orbit would not work because over such long 
time scales her orbit would rapidly decay via gravitational radiation. 

In fact there is a way to avoid these problems as well, which was suggested to 
us by Juan Maldacena. It involves putting a big AdS black hole in a throat geometry 
where the near horizon region is asymptotically AdS but there is also an asymptotically 
Minkowski region. In this way the geometry provides a box with the benefits of both 
the Minkowski box and the AdS box without the problems of either. There is still a 
problem with this setup, but it is more subtle and we return to it later in this section. 

28 It is unclear to what extent this argument can be applied to the eternal two-sided AdS black hole, 
since the thermal bath will make it difficult to extract energy. Indeed of all black holes this seems to 
be the least likely to have a firewall. Its gravity dual is two CFTs which are entangled in just the 
way that seems necessary to produce a smooth horizon [40]. When we discuss AdS black holes we will 
always be imagining one-sided black holes made from some sort of collapse. Interesting previous work 
on the interior of AdS black holes includes [73, 74]; it would be illuminating to understand how to ask 
the firewall question in either of these frameworks. 

5.2 Reissner-Nordstrom 

Another way Alice could try to extend the lifetime of her black hole is by giving it some 
charge. The Reissner-Nordstrom black hole of mass $M$ and charge $Q$ has metric 

$$ds^2 = -f(r)dt^2 + \frac{dr^2}{f(r)} + r^2 d\Omega_2^2, \tag{5.1}$$

where 

$$f(r) = \frac{(r - r_+)(r - r_-)}{r^2} \tag{5.2}$$

and 

$$r_\pm = GM \pm \sqrt{G(GM^2 - Q^2)}. \tag{5.3}$$

Its entropy is 

$$S = \frac{\pi r_+^2}{G}, \tag{5.4}$$

and its temperature is 

$$T = \frac{r_+ - r_-}{4\pi r_+^2}. \tag{5.5}$$

We see when $M\ell_p = Q$, the black hole is extremal and the temperature is zero. If we 
start the black hole with mass above extremality, it will radiate and gradually approach 
extremality. The relevant point here is that as it does this, the temperature decreases 
and semiclassically it appears that the decay takes an infinite amount of time. More 
precisely if we define the energy above extremality to be 

$$E = M - Q\ell_p^{-1}, \tag{5.6}$$

the energy-temperature relation is 

E = 2n 2 Q 3 T 2 £ p (5.7) 

and a simple calculation tells us that, for a black hole which starts with energy
$E_0 < Q\ell_p^{-1}$, we have

$$E(t) = \frac{E_0}{1 + a\,E_0\,t/Q^4}. \tag{5.8}$$

Here $a$ is some order one numerical constant. Thus it appears that the black hole
takes an infinite amount of time to reach extremality. There is a well-known problem
with this argument however [17, 18], which is that from (5.7) we see that when the
temperature reaches

$$T \sim \frac{1}{Q^3 \ell_p} \tag{5.9}$$

the remaining energy E above extremality is no longer bigger than the temperature 
T. At this point the semiclassical description of the evaporation process breaks down, 
and quantum gravity is necessary to understand what happens next. This is related 
to an instability of AdS2 called fragmentation [18], and in the cases where it can 
be understood in string theory it is always true that the geometry breaks apart into 
something that has little resemblance to the original Reissner-Nordstrom geometry. 

It is easy to see that no matter whether we start with $E$ near extremality or much
larger, this instability always sets in well before the exponential of the initial entropy.
For example, say that we start with $E < Q$ (setting $\ell_p = 1$). The initial entropy is
of order $Q^2$, while the time to reach the instability is of order $Q^7$. Alternatively if we
start with $E \gg Q$ then we are back to the Schwarzschild situation where evaporating
back down to $E \sim Q\ell_p^{-1}$ takes a time of order $E^3 \approx S_0^{3/2}$, and the additional time to get
down to the instability is $Q^7 \ll S_0^{7/2}$. The total evaporation time is always bounded
by a polynomial in $S_0$. Thus charged black holes, near extremal or otherwise, are of no
use to Alice.
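
The scalings quoted in this subsection can be recovered with a short estimate. This is only a sketch under assumptions that are ours rather than statements of the text: we take the emitted power to be of Stefan-Boltzmann form (horizon area times $T^4$), set $\ell_p = 1$, and drop all order-one constants.

```latex
\begin{align*}
\frac{dE}{dt} &\sim -r_+^2\,T^4 \sim -Q^2\left(\frac{E}{Q^3}\right)^{2} = -\frac{E^2}{Q^4}
  && \text{using } r_+ \sim Q \text{ and } T^2 \sim E/Q^3 \text{ from (5.7)},\\
\frac{1}{E(t)} - \frac{1}{E(0)} &\sim \frac{t}{Q^4}
  && \text{so } E \to 0 \text{ only as } t \to \infty,\\
E \sim T &\;\Longrightarrow\; T \sim Q^{-3},\quad E \sim Q^{-3}
  && \text{where the semiclassical description breaks down},\\
t_{\rm inst} &\sim \frac{Q^4}{E} \sim Q^{7} \sim S_0^{7/2} \ll e^{S_0}
  && \text{with } S_0 \sim Q^2 .
\end{align*}
```

The last line is the point of the estimate: starting near extremality, the time to reach the instability is a power of the initial entropy, not an exponential of it.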

5.3 Near Extremal AdS Throat 

Recall the AdS$_d$ Schwarzschild geometry

$$ds^2 = -f(r)\,dt^2 + \frac{dr^2}{f(r)} + r^2 d\Omega_{d-2}^2, \tag{5.10}$$

with

$$f(r) = 1 + \frac{r^2}{R^2} - \frac{\alpha}{r^{d-3}}. \tag{5.11}$$

A scalar field on this background feels an effective potential. In particular if we look
for solutions of the Klein-Gordon equation of the form

$$\phi = r^{-\frac{d-2}{2}}\, e^{-i\omega t}\, Y_\ell(\Omega_{d-2})\, \psi_{\omega\ell}(r), \tag{5.12}$$

it is not hard to see that $\psi_{\omega\ell}$ must obey a Schrödinger-type equation

$$-\frac{d^2}{dr_*^2}\psi_{\omega\ell} + V_{\rm eff}(r)\,\psi_{\omega\ell} = \omega^2\,\psi_{\omega\ell}. \tag{5.13}$$

Here $r_*$ is a "tortoise" type coordinate obeying

$$r_*'(r) = f^{-1}, \tag{5.14}$$

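and the effective potential is

$$V_{\rm eff}(r) = f(r)\left[m^2 + \frac{\ell(\ell+d-3)}{r^2} + \frac{(d-2)(d-4)}{4}\,\frac{f(r)}{r^2} + \frac{d-2}{2}\,\frac{f'(r)}{r}\right]. \tag{5.15}$$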

The details of this potential do not matter, but we see that it vanishes linearly at the
only real positive root of $f(r)$, that is at the horizon, and that it grows quadratically
with $r$ at large $r$. Near the horizon the solution then behaves as $\phi \sim e^{i\omega(\pm r_* - t)}$, while
at large $r$ we have $\phi \sim r^{-\Delta_\pm}$ with the usual AdS/CFT formula

$$\Delta_\pm = \frac{d-1}{2} \pm \frac{1}{2}\sqrt{(d-1)^2 + 4R^2m^2}. \tag{5.16}$$
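
The large-$r$ falloff can be checked directly from the Schrödinger-type equation. The following is a sketch under the assumption that at large $r$ the potential is dominated by its quadratically growing piece, $V_{\rm eff} \simeq (r^2/R^2)\,(m^2 + d(d-2)/4R^2)$, which is what one finds for a minimally coupled scalar when $f \simeq r^2/R^2$; the auxiliary variable $z$ and the exponent $p$ are introduced here only for illustration.

```latex
\begin{align*}
r \to \infty:&\quad f \simeq \frac{r^2}{R^2}, \qquad r_* \simeq \text{const} - \frac{R^2}{r}, \qquad
V_{\rm eff} \simeq \frac{r^2}{R^2}\left(m^2 + \frac{d(d-2)}{4R^2}\right),\\
z \equiv \frac{R^2}{r}:&\quad -\frac{d^2\psi}{dz^2} + \frac{m^2R^2 + d(d-2)/4}{z^2}\,\psi \simeq 0
 \quad (\text{the } \omega^2 \text{ term is subleading as } z \to 0),\\
\psi \propto z^{p}:&\quad p(p-1) = m^2R^2 + \frac{d(d-2)}{4}
 \;\Longrightarrow\; p_\pm = \frac{1}{2} \pm \frac{1}{2}\sqrt{(d-1)^2 + 4R^2m^2},\\
\phi = r^{-\frac{d-2}{2}}\,\psi &\propto z^{\frac{d-2}{2} + p_\pm} \propto r^{-\Delta_\pm},
 \qquad \Delta_\pm = \frac{d-1}{2} \pm \frac{1}{2}\sqrt{(d-1)^2 + 4R^2m^2}.
\end{align*}
```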

The idea of this section is to cut off this geometry at some large value of r and 
sew it onto Minkowski space, after which the effective potential (5.15) would go back 
to zero provided we set the scalar field mass m 2 to zero. The black hole would then be 
able to decay via massless quanta tunneling out of this potential into the Minkowski 
region, and by choosing the crossover value of r to be large we could adjust the decay 
time independently of the entropy of the black hole. We can also "outsource" the 
computation by putting the computer out in the Minkowski region, which allows us 
to buy a large redshift factor enhancement in the time it takes to do the computation 
from the point of view of Alice living down by the black hole. 

There is a new problem with this construction however, which is that any attempt 
to send the result of the computation from the Minkowski region back down the throat 
to the vicinity of the black hole has to get back through the potential barrier. The signal 
the computer sends down the throat will need to be very low energy, so its absorption 
probability will be exponentially small. Nonetheless one could imagine trying to send 
the signal repeatedly, hoping that eventually one of the times it will get through. This 
approach gets much harder as the size of the signal we wish to send increases, but 
unlike the previous examples it is not obvious that it cannot be done and we now have 
a serious computation to do. 

5.3.1 The Brane Setup 

To test this quantitatively we need a specific example of this type of geometry. In
string theory there is a standard way of producing throat geometries with the desired
properties by stacking branes; for example D3 branes in type IIB string theory or
M5 branes in M-theory. To realize the geometry (5.10) explicitly we would need a
configuration of branes which is spherically symmetric and stable. Spherical symmetry
is a bit inconvenient because branes in spherical configurations tend to collapse under
their own tension unless there is something else supporting them. A simple thing we
could do is start with the AdS$_3 \times S^3 \times T^4$ solution of IIB supergravity with RR flux,
itself the near horizon limit of the D1-D5 system, and then wrap some D3 branes
on the $S^3$. This throat would not be asymptotically Minkowski, but we could easily
arrange for the curvature radius of the AdS$_3$ to be much larger than the curvature
radius of the AdS$_5$ near the D3 branes. Rather than try to make this construction
work in detail, we will instead consider a simpler setup in which the black hole has 
planar symmetry instead of spherical symmetry. 

One of the best-known solutions of ten-dimensional IIB supergravity is the extremal
planar black 3-brane [75], with metric

$$ds^2 = Z(r)^{-1/2}\left(-dt^2 + dx^2\right) + Z(r)^{1/2}\left(dr^2 + r^2 d\Omega_5^2\right). \tag{5.17}$$

Here

$$Z(r) = 1 + \left(\frac{R}{r}\right)^4, \tag{5.18}$$

and $R$ is a parameter of the solution. The string theory interpretation of this solution
is that it gives the backreacted geometry in the presence of $N$ D3 branes, with

$$R^4 = 4\pi g N \ell_s^4 \sim N\ell_p^4, \tag{5.19}$$

where $g$ is the string coupling, $\ell_s$ is the string length, and $\ell_p$ is the ten-dimensional
Planck length. By looking at $Z(r)$ we see that this geometry indeed has the property
that for $r \ll R$ it behaves like AdS$_5$ in Poincaré coordinates, times an extra $S^5$ of
constant radius, while for $r \gg R$ it becomes ten dimensional Minkowski space.
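
Both limits follow by inserting the limiting form of the warp factor into the metric. The short worked check below assumes $Z(r) = 1 + R^4/r^4$ for the extremal solution and writes the planar directions as $dx^2$.

```latex
\begin{align*}
r \ll R:&\quad Z(r) \approx \frac{R^4}{r^4}
 \;\Longrightarrow\;
 ds^2 \approx \frac{r^2}{R^2}\left(-dt^2 + dx^2\right) + \frac{R^2}{r^2}\,dr^2 + R^2\, d\Omega_5^2 ,\\
r \gg R:&\quad Z(r) \approx 1
 \;\Longrightarrow\;
 ds^2 \approx -dt^2 + dx^2 + dr^2 + r^2\, d\Omega_5^2 .
\end{align*}
```

The first line is AdS$_5$ of radius $R$ in Poincaré coordinates times an $S^5$ of the same radius; the second is flat space written with polar coordinates in the six transverse directions.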

To get something like a black hole down the throat we need to put this solution 
at finite temperature, and to get the entropy to be finite we need to compactify the 
spatial x directions. Although it will not be explicit in our equations, we will choose 
anti-periodic boundary conditions for the fermions around the compact dimensions. 29 

29 The reason for this choice is that compactifying with periodic boundary conditions preserves
supersymmetry, which introduces an instability of the throat. Supersymmetry ensures there is no
potential energy cost for separating the D3 branes that make up the throat. In CFT language the
dual gauge theory is unable to pick a vacuum and its zero modes wander freely on its moduli space.
Antiperiodic boundary conditions break supersymmetry and generate a potential that keeps the branes
together. We thank Igor Klebanov and Juan Maldacena for discussions of this point.

To add some temperature we just need to consider the non-extremal version of the
solution [76]:

$$ds^2 = Z(r)^{-1/2}\left(-f(r)\,dt^2 + dx^2\right) + Z(r)^{1/2}\left(\frac{dr^2}{f(r)} + r^2 d\Omega_5^2\right), \tag{5.20}$$

where $Z(r)$ is now

$$Z(r) = 1 + \lambda\,\frac{R^4}{r^4}, \tag{5.21}$$

with

$$\lambda = \sqrt{1 + \frac{1}{4}\left(\frac{r_0}{R}\right)^8} - \frac{1}{2}\left(\frac{r_0}{R}\right)^4 \tag{5.22}$$

and

$$f(r) = 1 - \left(\frac{r_0}{r}\right)^4. \tag{5.23}$$

To put the black hole far down the throat we clearly want $r_0/R \ll 1$, and in this limit
the entropy of the black hole is

$$S = \frac{\Omega_5 L^3 R^2 r_0^3}{4G} \tag{5.24}$$

and the temperature is

$$T = \frac{r_0}{\pi R^2}. \tag{5.25}$$

Here $\Omega_5$ is the volume of a unit $S^5$, $L$ is the periodicity of the $x$ directions, and $G$ is
the ten-dimensional Newton constant. In the same limit the ADM energy 30 is

$$E_{ADM} = \frac{\Omega_5 L^3}{4\pi G}\left(R^4 + \frac{3}{4}\,r_0^4\right). \tag{5.26}$$

One can check that when $r_0 \to 0$ this reproduces the correct D3-brane tension. The
energy above extremality is

$$E = \frac{3\,\Omega_5 L^3 r_0^4}{16\pi G}. \tag{5.27}$$

It is straightforward to check that the instability encountered in Reissner-Nordström
does not happen here; in fact we have

$$E = \frac{3}{4}\,ST, \tag{5.28}$$



30 For an asymptotically flat geometry, in coordinates where the metric is $\eta_{\mu\nu} + h_{\mu\nu}$ with $h_{\mu\nu}$ small,
the ADM energy is defined [77] as $\frac{1}{16\pi G}\lim_{r\to\infty}\int dA\; n^i\left(\partial_j h_{ij} - \partial_i h_{jj}\right)$. Here the integral is over the
$S^5 \times T^3$ at infinity.

so the energy will be bigger than the temperature until the black hole has Planckian
area. We may then worry that the black hole evaporation time is independent of
the initial entropy, as we worried with Reissner-Nordström, but there is now a new
phenomenon which comes to the rescue.

5.3.2 Hawking-Page Transition for Toroidal Black Holes 

For spherical black holes in AdS with metric (5.10), it is well known [78, 79] that for
small enough values of $\alpha$ the black hole is unstable to decay by Hawking radiation. The
crossover point is when the temperature is of order $R^{-1}$. 31 This phenomenon provides
a natural endpoint for the type of decay we discussed in the previous section; the black 
hole will very slowly radiate energy up the throat and into Minkowski space until it 
reaches the critical temperature, after which it will decay essentially immediately into 
low energy quanta in the AdS-region of the throat. Does something similar occur for 
our toroidal black hole as well? The answer is yes, but as we now describe the analysis 
has a few details that differ from the spherical case. This transition has been previously 
discussed by [80]. 

Far down the throat, our geometry becomes a compactified version of a general
solution called a black AdS brane. In AdS$_d$ the metric for this solution is 32

$$ds^2 = r^2\left(-f(r)\,dt^2 + dx^2\right) + \frac{dr^2}{r^2 f(r)}, \tag{5.29}$$

with

$$f(r) = 1 - \frac{r_0^{d-1}}{r^{d-1}}. \tag{5.30}$$

In the spherical case the action of the Euclidean version of the solution (5.10) can
be compared to the action of another Euclidean solution which has the form (5.10)
again but with $\alpha = 0$. In both cases the Euclidean time $\tau = it$ is compactified. The
competition between these two solutions is the source of the Hawking-Page transition.
In the toroidal setting there is an analogous construction of a second solution, where
we set $r_0 = 0$, but it turns out to always be subdominant to the black brane solution
(5.29). There is another solution however with the same boundary conditions; we can
take the "emblackening factor" $f(r)$ and move it from in front of $-dt^2$ to one of the

31 More carefully this is the temperature below which the black hole no longer dominates the thermal
ensemble. In the microcanonical ensemble the energy below which a single black hole is actually
unstable is lower by some power of $R/\ell_p$, but this distinction will not be important for us for reasons
we explain below. We thank Juan Maldacena for explaining this distinction to us.

32 In this section we set the AdS radius $R$ to one.

planar coordinates: 33

$$ds^2 = r^2\left(-dt^2 + f(r)\,dx_1^2 + dx_2^2 + \dots + dx_{d-2}^2\right) + \frac{dr^2}{r^2 f(r)}. \tag{5.31}$$

This geometry is sometimes called the AdS soliton. It has no region behind the "horizon"
at $r = r_0$; instead it caps off smoothly provided that we choose the correct value
of $r_0$ as a function of the periodicity $L$. When we continue to Euclidean time we can
set the $\tau$ periodicity of the AdS soliton (5.31) freely, but for the black brane (5.29) we
must choose $r_0$ to be consistent with the $\tau$ periodicity for the Euclidean geometry to
be smooth. An important subtlety is that for a given periodicity of the circle at the
boundary, the correct value of $r_0$ and also the coordinate periodicity of $\tau$ are different
in the two solutions. 34 The Euclidean action is

$$S = -\frac{1}{16\pi G}\left(\int d^dx\,\sqrt{g}\,\bigl(R + (d-1)(d-2)\bigr) + 2\int d^{d-1}x\,\sqrt{\gamma}\,K\right), \tag{5.32}$$

where $\gamma$ is the determinant of the induced metric at the boundary and $K$ is the trace
of the extrinsic curvature. Making the action finite involves cutting off the geometry at
some large $r = r_c$, and then carefully matching the boundary geometry on the regulator
surface in the two cases. We will not present the details explicitly here since they are
fairly standard in the literature. (See [79, 81] for examples.) The result is that the
finite parts of the actions are

$$-S_{bb} = \frac{1}{16\pi G}\left(\frac{4\pi}{d-1}\right)^{d-1}(LT)^{d-2} \tag{5.33}$$

for the black brane and

$$-S_{sol} = \frac{1}{16\pi G}\left(\frac{4\pi}{d-1}\right)^{d-1}(LT)^{-1} \tag{5.34}$$

for the AdS soliton. Thus at high temperatures compared to $L^{-1}$ the black brane wins
while at low temperatures the AdS soliton wins. This then is the effect that we want;
as the black hole radiates it will eventually undergo a transition to some other type
of geometry with no horizon and Alice will no longer be able to test AMPS. The dual



33 Had we chosen supersymmetric boundary conditions in the spatial directions this solution would 
not exist. 

34 Actually in Euclidean signature they are really just the same solution with different parameters,
so the calculation only needs to be done once.

field theory interpretation of this is clear; it is the same large N phase transition as in
the spherical case, just studied with different spatial topology. 35
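
The competition between the two saddles can be made concrete with a few lines of Python. This is only an illustrative sketch: it assumes the finite parts of the actions scale as $-S_{bb} \propto (LT)^{d-2}$ for the black brane and $-S_{sol} \propto (LT)^{-1}$ for the AdS soliton, with the same prefactor $(4\pi/(d-1))^{d-1}/16\pi G$, in units where the AdS radius is one. The function names and the default $G = 1$ are ours.

```python
import math


def prefactor(d: int, G: float = 1.0) -> float:
    """Common prefactor (4*pi/(d-1))**(d-1) / (16*pi*G), AdS radius set to one."""
    return (4.0 * math.pi / (d - 1)) ** (d - 1) / (16.0 * math.pi * G)


def minus_action_black_brane(LT: float, d: int = 5, G: float = 1.0) -> float:
    """Finite part of -S for the black brane, assumed proportional to (LT)**(d-2)."""
    return prefactor(d, G) * LT ** (d - 2)


def minus_action_soliton(LT: float, d: int = 5, G: float = 1.0) -> float:
    """Finite part of -S for the AdS soliton, assumed proportional to 1/(LT)."""
    return prefactor(d, G) / LT


def dominant_saddle(LT: float, d: int = 5) -> str:
    """The saddle with the smaller Euclidean action (larger -S) dominates."""
    if minus_action_black_brane(LT, d) > minus_action_soliton(LT, d):
        return "black brane"
    return "AdS soliton"


if __name__ == "__main__":
    for LT in (0.5, 0.9, 1.1, 2.0):
        print(LT, dominant_saddle(LT))
```

Under these assumptions the crossover sits at $LT = 1$ for any $d$, since the two expressions differ only by the factor $(LT)^{d-1}$.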

5.3.3 Time Scales 

In this section we work out the time scales for sending signals down the throat from the
Minkowski region to the vicinity of the black hole horizon at $r_0$ in (5.20), as well as the
time to evaporate down to the transition temperature just discussed in the previous
section. Since we will ultimately compare these time scales to the computation and
recurrence times, both of which are exponential in the entropy of the black hole, we
will be focused on extracting only the pieces of them which are exponential in entropy.
Specifically we will write $S_0$ to mean the entropy of the black hole just after the Page
time, so that it is also roughly the size of the radiation and one half of the entropy of
the original black hole. If we had started the computer any later it would just have
made the task more difficult, and we want to give Alice a fair shot.

We will see shortly that for the decay to be slow enough for Alice to have a chance
at computing, we will need the temperature to be exponentially small in the entropy,
perhaps with some coefficient in front of $S_0$ in the exponent. From equation (5.25) this
means we will need $r_0$ to be exponentially small. From equation (5.24) we see that to
keep the entropy fixed in the same limit we will need $L$ to be exponentially large such
that $Lr_0$ is fixed. Looking at (5.27) we see that the energy above extremality will then
be exponentially small throughout the decay process. Having fixed $S_0$ we can also derive
an interesting bound on the AdS radius $R$ in Planck units: we know that we must have
$T > L^{-1}$ to avoid starting the computation below the phase transition discussed in the
previous section, so we must have

$$\left(\frac{R}{\ell_p}\right)^8 \lesssim S_0. \tag{5.35}$$

Thus the AdS radius in Planck units is bounded by a polynomial in $S_0$ and we can
neglect it in most equations.

35 As in the spherical case, in the microcanonical ensemble the energy at which the black hole is
actually unstable is somewhat lower than this. This decrease depends only on the parameters of the
AdS region of the geometry however, and is insensitive to the total length of the throat. Since we
will need the total length of the throat to be exponential in the entropy in the following section, which
will lead to an exponentially long decay time, the additional time to get from $T \sim L^{-1}$ to the actual
instability will be negligible compared to the time to get down to $T \sim L^{-1}$ in the first place.

To understand how hard it is to send some particular quantum down the throat, we
need to compute its absorption probability $P_{abs}$. This probability will be a function
of the frequency $\omega$ of the quantum of interest, and in the limit $\omega R \ll 1$ it can often
be computed analytically [82]. For massless scalars in the extremal black three brane
(5.17) this problem was studied by Klebanov in [83]. 36 A simple generalization of his
result shows that for a quantum with frequency $\omega$, angular quantum number $\ell$ on the $S^5$,
and momentum $k$ in the planar direction, the absorption probability is

$$P_{abs} \sim \left(\sqrt{\omega^2 - k^2}\,R\right)^{8+4\ell}. \tag{5.36}$$

In sending signals down the throat, we need to use low enough energy to avoid our
signals backreacting significantly on the throat. Certainly a necessary condition is that
we need $\omega < E$, and looking at (5.28) this means we need $\omega \sim T$ up to a power
of the entropy, which as usual we ignore. This means that we must use quanta of
exponentially low energy to send any messages; the absorption probability (5.36) will
thus be exponentially small. It is still possible to send a message, but we must try
many times. Since it takes a time $\omega^{-1}$ just to produce a message of energy $\omega$, to
have any chance of success sending the message we need a time of order

$$t_{msg} \sim \frac{1}{\omega P_{abs}}. \tag{5.37}$$

Clearly we have the best chance of sending a message using scalars if we set $k = \ell = 0$.
Other types of communication will have different absorption probabilities. For example,
we show in appendix A that the absorption probability for sending messages down
a string threading the throat by moving the string along the $S^5$ is proportional to
$(\omega R)^2$. Apparently this is a better method of communication than the massless scalar,
although we will see it is still not good enough to be of use to Alice. 37 In general we
will parametrize low-energy absorption probabilities as

$$P_{abs} = (\omega R)^b, \tag{5.38}$$

so neglecting all factors polynomial in $S_0$ we can estimate

$$t_{msg} \sim \frac{1}{T^{b+1}}. \tag{5.39}$$

The units here are provided either by powers of $\ell_p$ or $R$, it doesn't much matter. To be
concrete we can evaluate the temperature at the same time that we defined $S_0$, which
was just after the Page time.



36 Computing these absorption factors in the extremal background is an excellent approximation for
our purposes since $r_0 \ll R$ and the potential is very close to extremal.

37 We thank Joe Polchinski for suggesting a stringy telephone.

Absorption probabilities are also important in understanding how long it takes for
the black hole to evaporate. As Hawking showed in his original paper [84] the energy
flux out of a black hole is

$$\frac{dE}{dt} = -\sum_n \int \frac{d\omega}{2\pi}\,\frac{\omega\,P_{abs}(\omega, n)}{e^{\beta\omega} - 1}, \tag{5.40}$$

where the sum on $n$ is over different modes. It is often the case that only a particular
mode contributes significantly, for example for Schwarzschild black holes it is only the
$\ell = 0$ mode. The low energy absorption probability for Schwarzschild is proportional
to $(2GM\omega)^2$, from which one can use (5.40) to motivate the usual "Stefan-Boltzmann"
assumption for the decay rate. In that case the low energy approximation breaks down
before the peak of the integrand and numerical analysis is necessary to compute the
prefactor correctly [58], but for us the temperature is very low compared to $R^{-1}$ so
using the low energy approximation for $P_{abs}$ throughout is justified. The intuition of
(5.40) is quite simple; the thermal factor is just the expected occupation number of
the near horizon modes, and by a basic fact about one-dimensional scattering theory
the probability of absorption in from the outside is the same as the probability of
transmission out from the inside.

For the absorption probability (5.36) the decay is dominated by $\ell = 0$ modes, but
it is necessary to include modes of low but finite $k$. Roughly $k$ is quantized in units of
$1/L$, and since we are interested in the region where $TL > 1$ we will have $\omega \gg L^{-1}$.
We can then convert the sum over discrete modes into an integral over $k$ and write

$$\frac{dE}{dt} = -\int \frac{\omega\,d\omega}{2\pi}\,L^3\!\int_{|k|<\omega}\frac{d^3k}{(2\pi)^3}\,\frac{(\omega^2 - k^2)^4 R^8}{e^{\beta\omega} - 1} \propto -L^3R^8\int \frac{d\omega}{2\pi}\,\frac{\omega^{12}}{e^{\beta\omega} - 1} \propto -L^3R^8T^{13}. \tag{5.41}$$

More generally we can write

$$\frac{dE}{dt} \propto -L^aR^b\int \frac{d\omega}{2\pi}\,\frac{\omega^{a+b+1}}{e^{\beta\omega} - 1}, \tag{5.42}$$

where the parameter $b$ is the same as in (5.38) and the parameter $a$ accounts for the
phenomenon just encountered for the scalar. For the string we discuss in appendix A
we have $a = 0$, $b = 2$.

To find the evaporation time we need to integrate (5.42) to find the energy as a
function of time. As the decay proceeds $L$ cannot change because it is fixed by the
boundary conditions at $r \to \infty$, so it will be $r_0$ that gradually decreases. We integrate
from initial energy $E = S_0T$ down to final energy $E = L^{-1}$. The details of the
integral depend on whether $a + b - 2$ is positive, negative, or zero, but the final power
of $T$ does not. Indeed we find

$$t_{evap} \sim \frac{1}{T^{b+1}}, \tag{5.43}$$

where again we can make up the dimensions with either $R$ or $\ell_p$ without affecting the
exponent in the entropy. By comparing (5.43) to (5.39) we see that the evaporation
time is always the same order in $T$ as is the time to send any signal at all! 38 Which one
is bigger depends on the prefactors we omitted, but there are definitely cases where
$t_{msg} < t_{evap}$ so this fact by itself, although certainly troubling, is not enough to kill the
experiment.
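
The claim that the final power of $T$ is insensitive to the sign of $a + b - 2$ can be seen in a few lines. This is a sketch with all order-one constants and powers of $R$, $\ell_p$ dropped; it assumes $E \propto L^3T^4$ (from $E = \frac{3}{4}ST$ with $S \propto (LT)^3$) and $dE/dt \propto -L^aT^{a+b+2}$, and writes $T_i$, $T_f \sim L^{-1}$ for the initial and final temperatures, labels that are ours.

```latex
\begin{align*}
E \propto L^3 T^4, \quad \frac{dE}{dt} \propto -L^a\, T^{a+b+2}
 &\;\Longrightarrow\; dt \propto -L^{3-a}\,T^{\,1-a-b}\,dT ,\\
a+b>2:&\quad t_{\rm evap} \sim L^{3-a}\,T_f^{\,2-a-b}\Big|_{T_f\sim L^{-1}} \sim L^{\,b+1},\\
a+b<2:&\quad t_{\rm evap} \sim L^{3-a}\,T_i^{\,2-a-b},\\
a+b=2:&\quad t_{\rm evap} \sim L^{3-a}\,\log(T_i L) = L^{\,b+1}\log(T_i L),\\
L \sim S_0^{1/3}/T_i &\;\Longrightarrow\; t_{\rm evap} \sim T_i^{-(b+1)}
 \text{ in every case, up to powers of } S_0 .
\end{align*}
```

In the first case the integral is dominated by the late, cold stage of the decay and in the second by the early stage, but since $L$ and $T_i^{-1}$ agree up to a power of $S_0$ the exponent $b + 1$ comes out the same.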

In section 3.2 we argued that Alice's quantum computation takes a time of order
$e^{2S_0}$. 39 For now we will be a little more general and write this as

$$t_{comp} \sim e^{\alpha S_0}. \tag{5.44}$$

The recurrence time is basically $e^{S_0}$, but we need to include a red-shift factor to account
for the extremely low energy of the states involved in the recurrences. Thus

$$t_{rec} \sim T^{-1}e^{S_0}. \tag{5.45}$$

With these estimates we are finally in a position to assess the viability of Alice's
experiment. For the computation to finish before the black hole evaporates we need
$t_{comp} < t_{evap}$, which implies

$$T < e^{-\frac{\alpha}{b+1}S_0}. \tag{5.46}$$

This confirms our earlier claim that the temperature needs to be exponentially small
in the entropy. To be able to send a message down the throat in less than a recurrence
time we need $t_{msg} < t_{rec}$, which implies

$$T > e^{-\frac{1}{b}S_0}. \tag{5.47}$$

The condition that $t_{comp} < t_{rec}$ gives

$$T < e^{-(\alpha - 1)S_0}. \tag{5.48}$$

It is straightforward to see that all three of these can be satisfied only if

$$b < \frac{1}{\alpha - 1}. \tag{5.49}$$



38 It is the same b that appears in both because whichever b is smallest will control the decay rate 
and also give the highest probability of success for sending messages. 

39 The reader should not be confused by us writing $e^{2S_0}$ here and $2^{2n}$ there. Previously $n$ was the
entropy in base-2 logarithm while $S_0$ here is the entropy in the natural logarithm.

In the text we saw that if $U_R$ is completely general then $\alpha = 2$, in which case the
experiment can be done only if $b < 1$. Neither the string nor the free scalar field are
close to this, and actually there is a simple argument that no scalar field of any kind
can satisfy this inequality. The coefficient $b$ in the absorption factor is related to the
conformal dimension of the operator that the scalar couples to in the CFT dual as
$b = 2\Delta$ [85], so the unitarity bound $\Delta \geq 1$ in four dimensions precludes $b < 2$. A
similar argument can perhaps be constructed for the defect operators that couple to
the ends of general strings but we have not tried to do so.
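
The three timing conditions are easy to check mechanically. The sketch below is illustrative only: it assumes the exponential estimates $t_{comp} \sim e^{\alpha S_0}$, $t_{rec} \sim T^{-1}e^{S_0}$ and $t_{msg} \sim t_{evap} \sim T^{-(b+1)}$, drops every prefactor, and works with the variable $x = \log T / S_0$. The function name is ours.

```python
from typing import Optional, Tuple


def temperature_window(alpha: float, b: float) -> Optional[Tuple[float, float]]:
    """Allowed range of x = log(T)/S_0, or None if the three conditions conflict.

    t_comp < t_evap  =>  x < -alpha / (b + 1)
    t_msg  < t_rec   =>  x > -1 / b
    t_comp < t_rec   =>  x < -(alpha - 1)
    """
    if b <= 0:
        raise ValueError("b must be positive")
    lower = -1.0 / b
    upper = min(-alpha / (b + 1.0), -(alpha - 1.0))
    return (lower, upper) if lower < upper else None


if __name__ == "__main__":
    # Fully general decoding (alpha = 2) with the string (b = 2) and the scalar (b = 8).
    print(temperature_window(2.0, 2.0))  # prints None
    print(temperature_window(2.0, 8.0))  # prints None
    # A hypothetical faster algorithm with alpha below 3/2 reopens a window for the string.
    print(temperature_window(1.4, 2.0))
```

For $\alpha > 1$ a window exists exactly when $b(\alpha - 1) < 1$, so with $\alpha = 2$ it would take $b < 1$, below what the unitarity bound allows for a scalar.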

It is interesting that were the computer able to decrease $\alpha$ it would make it easier
to satisfy these inequalities. In fact we don't see any particular reason why improved
algorithms shouldn't be able to use special features of the black hole dynamics to
decrease $\alpha$ by some order one factor. If $\alpha$ could be decreased below 3/2, the string might
become an effective method for communicating down the throat. We stress however
that this is not sufficient for doing the AMPS experiment, it is only necessary. For
one thing even if the "true" $\alpha$ could be decreased by algorithms, our discussion below
equation (3.2) suggests that because of coarse-graining $\alpha$ should be increased by some
order one factor. More significantly, being able to send one piece of classical information
is not enough to do the strongest version of the AMPS experiment. That requires us
to send a particular quantum state which purifies $B$. Preserving the coherence of this
state would presumably require some sort of apparatus (also made out of fluctuations
on the string) which would also have to make it through the barrier, and even without
the apparatus we probably want $k$ to be at least a little bit bigger than one to be able
to build up any kind of statistics. Getting all of these things to make it through the
barrier at once probably requires us to raise $t_{msg}$ by some order one power, which would
help compensate for an $\alpha$ that has been decreased by clever algorithms. Even this is
not enough however; the purification of $B$ is more likely to be partially reflected than
to get all the way through, which means that it will be partially reflected many times
before most of it gets through. In fact getting slightly more than half of it through is
enough since somebody living down the throat can do error correction to restore the
other half, but each time that more than half of it is reflected the person outside will
need to do error correction before trying to send it again. This will usually succeed,
but there is a small $S_0$-independent probability it will fail. Since we need the correction
procedure to work every time in order to continue sending the correct state, this means
that eventually it will fail. For these reasons we are quite confident that the "strong"
AMPS experiment where Alice carries the purification of the black hole in with her
can't be done in this setup.

There is a weaker version of the AMPS experiment where the state of $R_B$ is measured
out in the Minkowski region at the end of the quantum computation and what
is sent down the throat to Alice is just a classical record of the result of that measurement.
Because classical information can be cloned this makes it easier to send the
record to Alice; she doesn't have to worry about error correcting and she can send multiple
copies at once. Sending exponentially many copies of the information down the
throat at once is dangerous from a back-reaction point of view however. For example,
to do this using strings would require an exponential number of strings, all parallel
and located at different values of $x$, which would become extremely dense down the
throat; the distances between the strings would become exponentially sub-Planckian.
For massless scalars the total number of modes we can use without backreaction is, at
the level of exponential factors, of order $(LT)^3$. Since $L \sim T^{-1}$, using all of these modes
at once is of no help in trying to beat the exponential in $t_{msg}$. Given the possibility of
$\alpha < 2$ we are not decisively able to rule out this "weak" AMPS experiment, but at a
minimum we are comfortable interpreting this section as casting serious doubt on the
feasibility of using an AdS throat to facilitate an AMPS experiment.

6 The Structure of the Hilbert Space 

The main argument of this paper is now complete, and although the paper is already 
long, we can't resist making a few comments about the possible implications of our 
results for how to think about the interior of a black hole. In the introduction we briefly 
discussed two alternatives, "strong complementarity" and "standard complementarity" , 
for how to think about Alice the infalling observer's quantum mechanics. It is very 
important to decide which, if either, of these frameworks is the correct way to think 
about black hole interiors. In this section we assess the status of each in turn in light 
of our computational arguments. This section has substantial overlap with a paper by 
Susskind [86] appearing simultaneously with this one, and which explains some of these 
ideas in more detail. 

The least restrictive idea for how to think about Alice is to imagine that she has 
her own quantum mechanics, possibly approximate since she encounters a singularity 
later, which is a priori independent of the quantum mechanics of an observer at infinity 
like Charlie. This type of theory has been argued for by Banks and Fischler for a while, 
who attempt to realize it precisely as quantum mechanics with no approximations for 
anybody, in a formalism called "holographic spacetime" [8, 9]. The basic idea will be 
discussed using as illustration figure 1 above. In this framework it seems necessary for 
consistency [12, 13] for Alice and Charlie to agree about the results of measurements in 
the green region of the figure. It is fairly clear in this setup that if Alice indeed cannot 
decode the radiation before jumping in, we are free to change her state in a way that 
produces no observable contradiction with the fact that Charlie, who is able to decode,
will later conclude that $R_B$ was entangled with $B$. Charlie, not having access to
the degrees of freedom behind the horizon, will not be able to check that $B$ is entangled
with $A$. Moreover a previous objection to strong complementarity [1], that it required
some sort of discontinuity in the experiences of a sequence of observers who jump in
at different times, is not relevant, since the only thing that matters in whether or not
$R_B$ can be decoded is whether or not the observer ever falls in. So Alice's inability to
decode apparently allows strong complementarity to be consistent without firewalls.

Although strong complementarity is in some sense straightforward, it is rather 
unsatisfying. Each observer having her own description of the universe, approximate or 
not according to taste, and with no clear precise relationship between them, seems to us 
like a rather inelegant fundamental framework. To quote Douglas Stanford it seems like 
"making it up". In particular it is to be contrasted with AdS/CFT [38, 39, 87], where 
there is a single Hilbert space and set of operators which is conventionally understood 
to describe all of the physics in AdS space within a single sharp framework. It would 
be reassuring if strong complementarity could be set into such a framework, in which 
its ambiguities would be understood as having to do with measurements that are not 
precisely well-defined. Without such an embedding, it would seem like we have taken 
a step backwards. Even in AdS space the CFT would not be a complete description of 
the physics of an infalling observer. 40 As we discussed above this embedding has been 
called $A = R_B$ in the context of firewalls; in the remainder of this section we sketch a
basic proposal for how it could work more precisely. 

We imagine that there is a single Hilbert space H, which we will think of as the 
CFT Hilbert space in an AdS setup to be concrete. To understand Charlie's physics on 
some spatial slice like the black one in figure 1, we need to compute expectation values 
of some set of operators $C_n$, which approximately commute. 41 Indeed there is a fairly
well-known method [45] for constructing these operators in some cases, which we
illustrate in figure 6.

Our proposal for the interior is then that there is another set of operators, which
we will call $A_n$'s, which are also mutually commuting and whose expectation
values in the same (Heisenberg picture) state used by Charlie describe Alice's
experience on the red slice in figure 1. Some of the $A_n$'s are interpreted by Alice as
being outside the horizon, and she can also try to construct them using the method
of [45]. Consistency then requires that these $A_n$'s are equal to the appropriate $C_n$'s to
prevent disagreement between Alice and Charlie about events in the green region of

40 The point of view that follows here is in some respects close to that of [14], although we disagree
with their assertion that their construction by itself addresses the argument of AMPS.

41 It is interesting to ask whether "approximately commute" means up to powers of $N^{-1}$ or up to
exponentially small terms in $N$. We are agnostic about this here.
Figure 6. The construction of bulk operators by Kabat, Lifschytz, and Lowe for an AdS
black hole. Here the yellow region is a lightcone ending on the operator and extending out 
to the boundary, and the operator is constructed by integrating a CFT operator over the 
boundary of the yellow region against a kernel that depends on the position of the operator. 
Note that as the operator approaches the horizon the operator becomes sensitive to the entire 
history on the boundary and thus to the details of the quantum state. 

figure 1. Others of the $A_n$'s are interpreted by Alice as being behind the horizon and,
as shown in the figure, the construction of [45] breaks down in that case. These $A_n$'s
naively do not seem to have low energy interpretations for Charlie. From the AMPS
argument however we know that to have a smooth horizon it must be that there are
some operators just outside the horizon, which act on what we've been calling B,
measurements of which need to be close to perfectly correlated with measurements of some of
the behind-the-horizon $A_n$'s. Before the Page time, none of Charlie's $C_n$'s are expected
to have this correlation with the B operators, and as recently argued by Susskind,
Verlinde, and Verlinde [16, 46] Charlie can then interpret the $A_n$'s as just being some
complicated mess acting on the remaining black hole. After the Page time, however,
Charlie expects the operators acting on B to be perfectly correlated with non-local
$C_n$'s acting on what we've called $R_B$ in the radiation. So it must be that from Charlie's
point of view the appropriate $A_n$'s now act on the complicated subfactor of the
radiation which purifies B. Hence the name $A = R_B$. This clearly is rather non-local,
but as shown in the figure the breakdown of the construction of [45] suggests that the
construction of operators behind the horizon does indeed depend on sensitive details
of the state.

This idea is very confusing to interpret however if Alice is able to decode the 
Hawking radiation, because she then has two low energy observables which she wants 
to identify with the SAME quantum mechanical operator on the same Hilbert space. 
This is sometimes called cloning, although it isn't really because the theory is quantum
mechanical and thus doesn't clone, but it seems like a rather serious problem for the 
physical interpretation of quantum mechanics. By doing low-energy manipulations of 
the Hawking radiation Alice would be able to construct a situation where looking at 
some localized piece of the Hawking radiation far from the black hole is indistinguishable 
from looking behind the horizon. At a minimum this type of observable bizarreness 
would allow acausal communication, and in any event it doesn't seem particularly less 
crazy than the idea that there is a firewall. In the context of the discussion of this 
paper however, if Alice in principle cannot decode $R_B$ then there does not seem to be
any such problem with interpreting $A_n$ as being behind the horizon from Alice's point
of view and out in the radiation from Charlie's point of view. This is something like 
strong complementarity, but now realized in a single quantum theory. 
A The Absorption Probability for a Nambu-Goto String

In this section we compute the low-energy absorption probability for transverse oscillations
of a string stretching down the extremal black three brane geometry (5.17). This
has previously been computed by Maldacena and Callan [88]; our method is the
same as in [83] for a massless scalar. The Nambu-Goto action is

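$$S_{NG} = -\frac{1}{2\pi\alpha'} \int d^2\sigma \sqrt{-\det\left(G_{MN}\, \partial_i X^M \partial_j X^N\right)}, \tag{A.1}$$

and parametrizing the string in static gauge and considering only oscillations in the $S^5$
direction we have

$$t = \tau, \qquad r = \sigma, \qquad x = 0, \qquad \theta = \theta(\tau, \sigma).$$

Linearizing the action in $\theta$ we find

$$S = \frac{1}{4\pi\alpha'} \int d\sigma\, d\tau\, \sigma^2 \left(Z(\sigma)\,\dot{\theta}^2 - \theta'^2\right), \tag{A.2}$$

and looking at modes of definite frequency $\omega$ the equation of motion is

$$\theta'' + \frac{2}{\sigma}\,\theta' + Z(\sigma)\,\omega^2\,\theta = 0. \tag{A.3}$$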
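The step from the Nambu-Goto action to the quadratic action and its equation of motion can be spelled out as follows. This is only a sketch: it assumes that the geometry (5.17) takes the standard extremal three-brane form $ds^2 = Z^{-1/2}(-dt^2 + dx^2) + Z^{1/2}(dr^2 + r^2 d\Omega_5^2)$, and the symbol $g_{ab}$ for the induced worldsheet metric is introduced here for illustration.

```latex
% Induced metric in static gauge (t = \tau, r = \sigma, x = 0, \theta = \theta(\tau,\sigma)),
% assuming ds^2 = Z^{-1/2}(-dt^2 + dx^2) + Z^{1/2}(dr^2 + r^2 d\Omega_5^2):
\begin{align}
g_{\tau\tau} &= -Z^{-1/2} + Z^{1/2}\sigma^2\dot{\theta}^2, \qquad
g_{\sigma\sigma} = Z^{1/2}\left(1 + \sigma^2\theta'^2\right), \qquad
g_{\tau\sigma} = Z^{1/2}\sigma^2\dot{\theta}\,\theta', \\
-\det g &= 1 + \sigma^2\theta'^2 - Z\sigma^2\dot{\theta}^2
  \quad\text{(the quartic terms cancel)}, \\
\sqrt{-\det g} &= 1 + \tfrac{1}{2}\sigma^2\left(\theta'^2 - Z\dot{\theta}^2\right) + O(\theta^4).
\end{align}
% Dropping the \theta-independent constant and multiplying by -1/(2\pi\alpha'):
\begin{equation}
S = \frac{1}{4\pi\alpha'}\int d\sigma\, d\tau\, \sigma^2\left(Z\dot{\theta}^2 - \theta'^2\right).
\end{equation}
% Varying with respect to \theta and setting \theta \propto e^{-i\omega\tau}:
\begin{equation}
\partial_\sigma\!\left(\sigma^2\theta'\right) + \sigma^2 Z\,\omega^2\theta = 0
\quad\Longrightarrow\quad
\theta'' + \frac{2}{\sigma}\theta' + Z\,\omega^2\theta = 0 .
\end{equation}
```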

Defining $\rho = \omega\sigma$, the equation becomes

$$\theta'' + \frac{2}{\rho}\,\theta' + \left[1 + \left(\frac{\omega R}{\rho}\right)^4\right]\theta = 0. \tag{A.4}$$

For $\rho \gg \omega R$ the solution is approximately

$$\theta = A\,\frac{e^{i\rho}}{\rho} + B\,\frac{e^{-i\rho}}{\rho}, \tag{A.5}$$

while for $\rho \ll \omega R$ the solution is approximately

$$\theta = A' e^{i(\omega R)^2/\rho} + B' e^{-i(\omega R)^2/\rho}. \tag{A.6}$$

When $\omega R \ll 1$ we can also find an approximate solution for $(\omega R)^2 \ll \rho \ll 1$ by
keeping only the derivative terms in (A.4):

$$\theta = \frac{C}{\rho} + D. \tag{A.7}$$

Since this range overlaps with the other two ranges, we can use this solution to connect
them together. Since we are computing an absorption probability we want $B' = 0$,
which means that matching the two "inner" regions gives

$$C = i\omega^2 R^2 A', \qquad D = A'. \tag{A.8}$$

Matching the "outer" two regions gives

$$C = A + B, \qquad D = i(A - B), \tag{A.9}$$

so when $\omega R \ll 1$ we have

$$A = -\frac{i}{2}\left(1 - \omega^2 R^2\right) A', \qquad B = \frac{i}{2}\left(1 + \omega^2 R^2\right) A'. \tag{A.10}$$
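The two matching conditions follow from expanding the wave solutions inside the intermediate region. The block below is a sketch of that expansion together with a flux-conservation check. It writes the near-horizon amplitudes as $A'$ and $B'$ (with $B' = 0$) to distinguish them from the far-zone amplitudes $A$ and $B$; that labeling, and the name $W$ for the conserved flux, are conventions adopted here for illustration.

```latex
% Far-zone solution (A.5) expanded for \rho \ll 1:
\begin{equation}
A\,\frac{e^{i\rho}}{\rho} + B\,\frac{e^{-i\rho}}{\rho}
  = \frac{A + B}{\rho} + i(A - B) + O(\rho)
  \quad\Longrightarrow\quad C = A + B, \;\; D = i(A - B).
\end{equation}
% Near-horizon ingoing solution expanded for \rho \gg (\omega R)^2:
\begin{equation}
A' e^{i(\omega R)^2/\rho}
  = A' + \frac{i(\omega R)^2 A'}{\rho} + O\!\left(\frac{(\omega R)^4}{\rho^2}\right)
  \quad\Longrightarrow\quad C = i\omega^2 R^2 A', \;\; D = A'.
\end{equation}
% Solving the first pair for the far-zone amplitudes:
\begin{equation}
A = \tfrac{1}{2}\left(C - iD\right), \qquad B = \tfrac{1}{2}\left(C + iD\right).
\end{equation}
% Consistency check: W = \rho^2\,\mathrm{Im}(\theta^*\theta') is constant for any
% solution of (A.4), because the equation has real coefficients.
\begin{equation}
W\big|_{\rho \to 0} = -(\omega R)^2 |A'|^2, \qquad
W\big|_{\rho \to \infty} = |A|^2 - |B|^2
\quad\Longrightarrow\quad
|B|^2 - |A|^2 = (\omega R)^2 |A'|^2 .
\end{equation}
```

Substituting $C$ and $D$ from the second line into the third reproduces this flux relation exactly, which is a useful check on the signs in the matching.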



Finally, to compute the absorption probability we inspect equations (A.5) and (A.6),
switching back from $\rho$ to $r$, and compute the square of the ratio of the coefficients of
the waves. The result is



consistent with [88]. One could also study oscillations along the brane direction;
according to [89] these give an absorption probability proportional to $(\omega R)^4$.
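The matched-asymptotics argument can be cross-checked numerically. The sketch below integrates the radial equation (A.4) outward from a purely ingoing near-horizon solution and reads off the far-zone amplitudes. It assumes the form $\theta'' + (2/\rho)\theta' + [1 + (\omega R/\rho)^4]\theta = 0$ for (A.4); the function names, the variable `eps` (standing for $\omega R$), the starting radius, and the step-size rule are all choices made here for illustration. For comparison it prints $4(\omega R)^2$, the leading-order value that the matching relations together with flux conservation suggest; this is an inference from the derivation, not a formula quoted from the text.

```python
import cmath
import math


def far_amplitudes(eps, rho_end=40.0, tol=0.01):
    """Integrate (A.4) outward from the horizon and return far-zone (A, B).

    eps stands for omega*R. We evolve chi = rho*theta, which turns (A.4) into
    chi'' = -(1 + eps^4/rho^4) chi, starting from the purely ingoing solution
    theta = exp(i eps^2/rho) with unit amplitude.
    """
    def k2(r):
        return 1.0 + (eps / r) ** 4

    rho = eps ** 2 / 20.0  # deep inside the region rho << eps
    phase = cmath.exp(1j * eps ** 2 / rho)
    chi = rho * phase
    dchi = phase * (1.0 - 1j * eps ** 2 / rho)

    while rho < rho_end:
        # Keep the step small compared with the local wavelength and with rho.
        h = tol * min(1.0 / math.sqrt(k2(rho)), rho)
        h = min(h, rho_end - rho)
        k_start, k_mid, k_end = k2(rho), k2(rho + 0.5 * h), k2(rho + h)
        # Classical fourth-order Runge-Kutta step for the pair (chi, dchi).
        a1, b1 = dchi, -k_start * chi
        a2, b2 = dchi + 0.5 * h * b1, -k_mid * (chi + 0.5 * h * a1)
        a3, b3 = dchi + 0.5 * h * b2, -k_mid * (chi + 0.5 * h * a2)
        a4, b4 = dchi + h * b3, -k_end * (chi + h * a3)
        chi += h * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0
        dchi += h * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
        rho += h

    # Far zone: chi = A e^{i rho} + B e^{-i rho}, so chi -/+ i chi' isolates A, B.
    amp_a = 0.5 * (chi - 1j * dchi) * cmath.exp(-1j * rho)
    amp_b = 0.5 * (chi + 1j * dchi) * cmath.exp(1j * rho)
    return amp_a, amp_b


def absorption_probability(eps):
    """One minus the reflected fraction |A/B|^2 of the incoming wave."""
    amp_a, amp_b = far_amplitudes(eps)
    return 1.0 - abs(amp_a) ** 2 / abs(amp_b) ** 2


if __name__ == "__main__":
    for eps in (0.05, 0.1, 0.2):
        print(eps, absorption_probability(eps), 4 * eps ** 2)
```

The substitution $\chi = \rho\theta$ removes the first-derivative term, so the far-zone amplitudes can be extracted algebraically from $\chi$ and $\chi'$ at a single radius instead of by fitting.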